Cristina

@biologeek

Metabolomics, #Rstats and feature importance curiosity 🤔

Pinned

Ok! PCA before + after correction done! Or should I say Dunn? :) Big thanks to @BroadhurstDavid for communicating their work. Current work based on Dunn et al., 2011 but in forthcoming analyses, a thing or 2 are eligible for update based on Broadhurst et al., 2018 #metabolomics

Cristina reposted

It never ceases to amaze me what people can make with gganimate #rstats

This Tweet is no longer available.

Cristina reposted

#RStats — Can we scrape the online documentation of an API to automate the creation of an R wrapper 📦? Spoiler: yes. "Automate the Creation of an API Wrapper package by Scraping its Online Documentation" colinfay.me/fun-from-api-d…


Cristina reposted

I think dplyr::all_equal() should do most of that. Not sure about types


Cristina reposted

omg I am always using this now! usethis.r-lib.org/reference/git_…


Cristina reposted

upvoting qs- handles any R object and comparable to fst in speed. The main difference from fst is qs doesn’t support random access, eg how fst allows reading only specific cols/rows. But read/write speeds overall close. I think they share a bunch of implementation strategies.


Cristina reposted

Retweeting because I am really excited about this. I am willing to bet that 1) thinking about the next hypothesis you will test in machine readable terms will immediately improve what you are doing, and 2) better meta-data will make science massively more efficient.

New preprint with @LisaDeBruine where we make the case for machine readable hypothesis tests psyarxiv.com/5xcda/. We give a real-life example, argue this would improve the rigour and falsifiability of hypothesis tests, as well as facilitate the re-use of key info in articles.


Cristina reposted

TIL: I learnt about the conflicted 📦 My filter function always gets masked, so my solution till today was dplyr::filter. But there is a better way! You can set your function:library preference at the top of your script! 😭🙏 e.g. conflict_prefer("filter", "dplyr") #rstats


Cristina reposted

A thread of classifiers learning a decision rule. Dashed line is optimal boundary. Animations with #gganimate by @thomasp85 and @drob. #rstats Logistic regression {stats::glm} with each class having normally distributed features. (1/n)


Cristina reposted

I finally got around to looking up the linear algebra of matrix rotations for my PCA explanation.


Cristina reposted

We explain the concept of calibration in the link below. In short, calibration is about the predicted risks (probabilities) that come out of your prediction model and whether or not these risks are consistent with the proportion of events you observed

Sorry for the shameless plug, but you might be interested in this: bmcmedicine.biomedcentral.com/articles/10.11…
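
The calibration idea described above (do predicted risks match the proportion of events actually observed?) can be sketched roughly as follows. This is only an illustrative Python sketch, not the method from the linked article: the function name `calibration_table`, the equal-width binning, and the toy data are all assumptions made for the example.

```python
from collections import defaultdict


def calibration_table(predicted_risks, outcomes, n_bins=5):
    """Compare mean predicted risk with observed event rate per risk bin.

    Hypothetical helper for illustration: risks are grouped into
    equal-width bins on [0, 1]; outcomes are 0 (no event) or 1 (event).
    """
    if len(predicted_risks) != len(outcomes):
        raise ValueError("predicted_risks and outcomes must have the same length")

    bins = defaultdict(list)
    for risk, outcome in zip(predicted_risks, outcomes):
        # Clamp so that a risk of exactly 1.0 lands in the last bin.
        index = min(int(risk * n_bins), n_bins - 1)
        bins[index].append((risk, outcome))

    table = []
    for index in sorted(bins):
        pairs = bins[index]
        mean_risk = sum(r for r, _ in pairs) / len(pairs)
        event_rate = sum(o for _, o in pairs) / len(pairs)
        table.append((index, len(pairs), mean_risk, event_rate))
    return table


# Made-up toy data: 10 people predicted at 20% risk, 10 at 80% risk.
risks = [0.2] * 10 + [0.8] * 10
events = [1, 1] + [0] * 8 + [1] * 8 + [0, 0]

for index, n, mean_risk, event_rate in calibration_table(risks, events, n_bins=2):
    print(f"bin {index}: n={n}, mean predicted risk={mean_risk:.2f}, "
          f"observed event rate={event_rate:.2f}")
```

In a well-calibrated model the two columns stay close in every bin; a large gap in some bin suggests the model over- or under-estimates risk in that range.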



Cristina reposted

Instead of referring to myself as self-taught, I'm gonna start referring to myself as community-taught. The sites, the blogs, the books, the user groups, the confs, the forums ... all community efforts that I used to learn and advance my programming and data science knowledge.


Cristina reposted

Computer: change your password Me: ********** Computer: new password does not meet requirements Me: **************** Computer: new password does not meet requirements Me: ************************** Computer: new password does not meet requirements Me:


Cristina reposted

The null-coalescing operator %||% is in the miscellany section of this talk on making conditional logic easier to read and maintain (links to a specific slide): speakerdeck.com/jennybc/code-s… %||% is key to some nice design patterns eg, using NULL as the default val of optional args.
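
For readers who have not met the pattern, here is a rough Python analogue of the "NULL as the default value of an optional argument" idea. It is not the R operator itself and is not taken from the linked slides: `coalesce` and `plot_title` are hypothetical names, and Python's `None` stands in for R's `NULL`.

```python
def coalesce(value, default):
    """Return `value` unless it is None, in which case return `default`.

    Plays the role that `x %||% y` plays in R (hypothetical helper).
    """
    return default if value is None else value


def plot_title(name, title=None):
    # None means "the caller did not supply a title", so the fallback
    # can be computed from another argument at call time.
    return coalesce(title, f"Plot of {name}")


print(plot_title("PCA"))
print(plot_title("PCA", title="Scores before correction"))
```

Checking specifically for `None`, instead of using `value or default`, keeps legitimate falsy values such as `0` or an empty string from being replaced by the fallback.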


This is awesome! And happy to see @thomasp85 #geneRativeart projects included in the list 🎨💻



Cristina reposted

Introducing PheCAP, a high-throughput semi-supervised phenotyping pipeline. PheCAP starts with EMR data (structured and NLP) and outputs a phenotype algorithm, the probability of the phenotype for all patients, and a phenotype classification (yes or no). nature.com/articles/s4159…


Cristina reposted

Interested in small molecule retention time prediction? Check out our new paper in @NatureComms introducing the SMRT dataset, containing the experimental RT of 80K molecules generated in @kadzuis lab at @scrippsresearch #metabolomics #MachineLearning nature.com/articles/s4146…


Cristina reposted

genome-wide analysis identifies molecular systems and 149 genetic loci associated with believing GWAS results are biologically meaningful

