R text mining - investigate how companies contribute to scientific research
€8-30 EUR
In progress
Posted: over 6 years ago
Paid on delivery
The mission is to investigate how companies contribute to scientific research.
You will have to work with messy, real-world data, which unfortunately has no unique identifiers that could be used to perform a perfect match. As a result, creative solutions are needed, perhaps using approximate string matching, conditioning, etc. For instance, the company “Baxter International Inc” appears under many entity names, such as “Baxter International” or “Baxter Healthcare Corporation”, both of which would be correct matches. So it is likely to be appropriate to search using the first word as a keyword. On the other hand, such a search-by-first-word strategy will fail in a case such as “T2 Biosystems, Inc”.
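As a rough illustration of the matching idea above, here is a minimal sketch in Python (one of the "Python equivalents" this posting allows; in R, base functions such as `agrep` and `adist` or the `stringdist` package could play the same role). Everything in it is an assumption made for illustration, not part of the specification: the suffix list, the function names `normalize` and `is_match`, the 0.85 similarity threshold, and the rule that a first word is only trusted as a keyword when it has at least four characters. The name “T2 Pharma Ltd” is a made-up counterexample.

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of legal-form suffixes to ignore; extend it as the data requires.
LEGAL_SUFFIXES = {"inc", "incorporated", "corp", "corporation", "co",
                  "company", "ltd", "llc", "plc", "gmbh", "ag", "sa"}


def normalize(name):
    """Lowercase, replace punctuation with spaces, and drop trailing legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def is_match(company, affiliation, threshold=0.85, min_keyword_len=4):
    """Return True if the two names plausibly refer to the same company.

    Rule 1 (keyword): identical first word, trusted only when it is long
    enough to be distinctive (assumed cutoff: min_keyword_len characters).
    Rule 2 (fallback): similarity ratio of the normalized names >= threshold.
    """
    a, b = normalize(company), normalize(affiliation)
    if not a or not b:
        return False
    first_a, first_b = a.split()[0], b.split()[0]
    if first_a == first_b and len(first_a) >= min_keyword_len:
        return True
    return SequenceMatcher(None, a, b).ratio() >= threshold


if __name__ == "__main__":
    print(is_match("Baxter International Inc", "Baxter Healthcare Corporation"))
    print(is_match("T2 Biosystems, Inc", "T2 Biosystems Inc."))
    print(is_match("T2 Biosystems, Inc", "T2 Pharma Ltd"))
```

With these assumed settings, the two Baxter names match through the first-word rule, while the short keyword “T2” is not trusted on its own, so the T2 names are compared as whole normalized strings instead. Both the threshold and the keyword-length cutoff would need tuning against the real data.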
You will need to familiarise yourself with existing R packages (or possibly Python equivalents) to connect to the relevant APIs; more details will be provided to shortlisted candidates in a specification file.
If the project is successfully completed, follow-up tasks and projects can be agreed upon, so there is potential for further cooperation rather than a one-time gig. Hopefully you will find the task interesting and stimulating, and will enjoy learning new packages and working with data on scientific publishing. I look forward to your response! Thanks.
Fuzzy matching of company names using R
Relevant Skills and Experience
R programming / scientific research experience
Proposed Milestones
€30 EUR - Time for matching the company names