Track disease progression with Python nltk and SQL

Question

I have many gigabytes of Facebook / Twitter / RSS data.

I use it to track, in general terms, how hyperparathyroidism progresses across a broad population: from diagnosis, through the drugs people took and the treatment methods used, to the end results.

I am new to NLTK, but I have excellent Python / SQL experience.

All my data is about the parathyroid; however, as you can see below (a sample of the Twitter data), it is linguistically terrible:

 omg i think my parathyroid is screwed up!!! Have been stuck at parathyroid hormone. STOP GETTING ON TWITTER JASMINE. Cryopreservation of Parathyroid Tissue after Parathyroid Surgery for Renal Hyperparathyroidism The Parathyroid as a Target for Radiation Damage it for the parathyroid hormone la
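Text this noisy usually needs normalizing before any analysis. The following is only a minimal sketch, assuming that lowercasing and keeping alphabetic tokens is acceptable for this data; the `tokenize` helper and the `SAMPLE` string are hypothetical names used for illustration. It uses only the standard library; NLTK's `TweetTokenizer` is a more complete alternative for tweets.

```python
import re
from collections import Counter

# Hypothetical sample, shortened from the Twitter text above.
SAMPLE = (
    "omg i think my parathyroid is screwed up!!! "
    "STOP GETTING ON TWITTER JASMINE. "
    "Cryopreservation of Parathyroid Tissue after Parathyroid Surgery"
)

# Alphabetic words, optionally with an internal apostrophe (e.g. "don't").
TOKEN_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")


def tokenize(text):
    """Lowercase the text and return its word tokens, dropping punctuation."""
    return TOKEN_RE.findall(text.lower())


tokens = tokenize(SAMPLE)
counts = Counter(tokens)  # term frequencies for the sample
print(counts.most_common(3))
```

Lowercasing merges "Parathyroid" and "parathyroid" into a single term, which matters when most messages mention the same keyword in different casings.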

All of this data is stored in a database, along with fields such as poster, zip code, message text, etc.

I was wondering if anyone could point me in the right direction for the following:

Are there effective algorithms to help me do what I need?
Linguistically, how can we find correlations in the data? We are trying to track patterns.
Is there some kind of "mesh" form that I need to put the data into to help with the analysis?
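
One simple way to start looking for patterns is to count which terms of interest appear together in the same message. The following is a rough sketch under stated assumptions, not a recommended method: the `TERMS` vocabulary and the `cooccurrence` function are hypothetical, and a real term list would have to come from domain knowledge.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical vocabulary of interest (an assumption, not from the data).
TERMS = {"parathyroid", "surgery", "hormone", "calcium"}


def cooccurrence(messages, terms=TERMS):
    """Count how often each pair of terms appears in the same message."""
    pairs = Counter()
    for msg in messages:
        words = set(re.findall(r"[a-z]+", msg.lower()))
        present = sorted(words & set(terms))
        # Each unordered pair is counted once per message.
        pairs.update(combinations(present, 2))
    return pairs


messages = [
    "Parathyroid surgery went well",
    "parathyroid hormone is high",
    "surgery on my parathyroid.",
]
result = cooccurrence(messages)
print(result.most_common())
```

Pairs are stored as alphabetically sorted tuples, so `("parathyroid", "surgery")` and `("surgery", "parathyroid")` are counted as the same pair. The counts could then be grouped by other fields such as zip code or poster.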

python algorithm sql postgresql twitter

Aug 24 '12 at 4:12

No one has answered this question yet.
