Refine
Document Type
- Article (3)
Language
- English (3)
Has Fulltext
- yes (3)
Is part of the Bibliography
- no (3)
Keywords
- BLAST algorithm (1)
- Bacterial genomics (1)
- Genomic databases (1)
- Metagenomics (1)
- OSGi (1)
- Sequence alignment (1)
- accidental falls (1)
- cloud computing (1)
- data mining (1)
- distributed environments (1)
Metagenomic studies use high-throughput sequence data to investigate microbial communities in situ. However, considerable challenges remain in the analysis of these data, particularly with regard to speed and reliable analysis of microbial species as opposed to higher level taxa such as phyla. We here present Genometa, a computationally undemanding graphical user interface program that enables identification of bacterial species and gene content from datasets generated by inexpensive high-throughput short read sequencing technologies. Our approach was first verified on two simulated metagenomic short read datasets, detecting 100% and 94% of the bacterial species included with few false positives or false negatives. Subsequent comparative benchmarking analysis against three popular metagenomic algorithms on an Illumina human gut dataset revealed Genometa to attribute the most reads to bacteria at species level (i.e. including all strains of that species) and demonstrate similar or better accuracy than the other programs. Lastly, speed was demonstrated to be many times that of BLAST due to the use of modern short read aligners. Our method is highly accurate if bacteria in the sample are represented by genomes in the reference sequence but cannot find species absent from the reference. This method is one of the most user-friendly and resource efficient approaches and is thus feasible for rapidly analysing millions of short reads on a personal computer.
OSGi in Cloud Environments
(2013)
Mining geriatric assessment data for in-patient fall prediction models and high-risk subgroups
(2012)
Background: Hospital in-patient falls constitute a prominent problem in terms of costs and consequences. Geriatric institutions are most often affected, and common screening tools cannot predict in-patient falls consistently. Our objectives are to derive comprehensible fall risk classification models from a large data set of geriatric in-patients’ assessment data and to evaluate their predictive performance (aim#1), and to identify high-risk subgroups from the data (aim#2).
Methods: A data set of n = 5,176 single in-patient episodes covering 1.5 years of admissions to a geriatric hospital were extracted from the hospital’s data base and matched with fall incident reports (n = 493). A classification tree model was induced using the C4.5 algorithm as well as a logistic regression model, and their predictive performance was evaluated. Furthermore, high-risk subgroups were identified from extracted classification rules with a support of more than 100 instances.
Results: The classification tree model showed an overall classification accuracy of 66%, with a sensitivity of 55.4%, a specificity of 67.1%, positive and negative predictive values of 15% resp. 93.5%. Five high-risk groups were identified, defined by high age, low Barthel index, cognitive impairment, multi-medication and co-morbidity.
Conclusions: Our results show that a little more than half of the fallers may be identified correctly by our model, but the positive predictive value is too low to be applicable. Non-fallers, on the other hand, may be sorted out with the model quite well. The high-risk subgroups and the risk factors identified (age, low ADL score, cognitive impairment, institutionalization, polypharmacy and co-morbidity) reflect domain knowledge and may be used to screen certain subgroups of patients with a high risk of falling. Classification models derived from a large data set using data mining methods can compete with current dedicated fall risk screening tools, yet lack diagnostic precision. High-risk subgroups may be identified automatically from existing geriatric assessment data, especially when combined with domain knowledge in a hybrid classification model. Further work is necessary to validate our approach in a controlled prospective setting.