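[PS review request] Lowly CS applicant / my writing is painfully plain / choosing words is so hard / mi (forum points) for anyone who gives suggestions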

Ryan_Panda
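I've read so many of everyone's PSs for reference that my eyes have gone blurry. Woof woof.
Then I agonize over my own wording, sometimes thinking for a whole afternoon [actually, after 5 minutes of thinking I start zoning out, orz...]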

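I've finally finished writing it and have revised it once more. For now I'm at a loss and don't know what else to change.
It's a little over 700 words in total.
To everyone passing by: please help me out with some ideas.
I'll give everyone mi (forum points) [not many, but a small token of thanks!]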

[align="left"]I have always admired data scientists like Larry Page who can dig treasure out of masses of data. Thanks to a privilege granted to the top 3 students in the honors class, I got a golden chance to join a laboratory focusing on data mining in my sophomore year. Since then, I have participated in multiple research projects and gradually fallen in love with data science.[/align][align="left"] [/align][align="left"]This wonderful journey started with my participation in a bioinformatics research project that mined the relationships between genes and diseases from GeneRIF, short statements about the functions of genes. In this project, we ran enrichment tests on various genes to obtain lists of related diseases, ranked by enrichment test score. To test the stability of the rankings, my teammates and I developed an evaluation program in C++ that repeatedly sampled data at random from the database, added noise, and evaluated how strongly the rankings were affected. I then quantified the evaluation results as scores and manually examined over 1000 data items whose scores exceeded a threshold. Although the work seemed tedious and time-consuming to others, I found the process incredibly interesting. Organizing and mining those separate pieces of information was like treasure hunting, with all the pleasure of exploring the unknown. I also truly felt that what our research team did was meaningful, because our work organized thousands of results from gene-related papers and would greatly assist biomedical experts in predicting the effects and symptoms of certain gene mutations.[/align][align="left"] [/align][align="left"]This research experience not only ignited my passion for data science but also taught me how to conduct research from beginning to end. This proved invaluable when I began to take independent research seminars in my junior year. 
Inspired by an article on the relationship between mood and surroundings such as day length and temperature, I wondered whether there was any relationship between mood and air quality, which was a hot topic in China. I acquired mood data by crawling posts from Weibo, a Chinese Twitter-like microblogging website, and classifying each post as positive or negative in mood using a text analysis program. In the process, I refined my crawler multiple times, solved problems such as running out of memory, stack overflows and dropped database connections, and eventually collected posts covering 69 consecutive days. I then visualized and compared the trends in air quality and in the numbers of positive and negative posts. Although the trends appeared unrelated in this study, perhaps because of noise in the data, I still got endless joy from the whole process. I became more experienced not only in programming but also in the methodology of searching for, collecting and mining data. It further strengthened my resolve to pursue data science.[/align][align="left"] [/align][align="left"]In my independent research, I became aware of the importance of data pre-processing. In later research, I have therefore always taken great care to minimize noise in the data using mathematical and programming methods. Last semester, I took part in a research project analyzing ontology-driven co-occurrence between biomedical terms extracted from consumer-generated healthcare content. While examining the original data, I found that this content frequently contained spelling errors, which would bias the results on a large scale. Taking into consideration factors such as performance, suitability, open-source availability and convenience, my teammates and I decided to build our spelling correction system on the Google Spelling Checker API. The system successfully corrected 85.24% of spelling errors, greatly improving the accuracy of further data analysis. 
In addition, inspired by my former work, I used biomedical ontologies to filter the corrected words needing manual inspection, saving two thirds of the time I would otherwise have spent. My work culminated in two papers, one on the ontology-driven co-occurrence analysis and one on the spelling correction of consumer-generated healthcare content.[/align][align="left"] [/align][align="left"]After participating in these research projects, I began to wonder whether there was any mathematical or algorithmic analysis tool that could be applied in multiple situations. That is why I joined a research project on deep learning. It can model various high-level abstractions in data and can be used for classification, regression, feature extraction and dimensionality reduction. I am now trying to apply deep learning to automatically classify the large bioinformatics data of TCGA and to extract latent genomic features from it. Though I am still unable to fully understand and deftly use this algorithm, the power and potential of deep learning have amazed me, and I truly hope to go on studying it in graduate school.[/align][align="left"] [/align][align="left"]Your school is known for… which is exactly what I’m looking for. I really hope to join your school.[/align]