Dami's blog full of codes – Page 2 – …just a list of useful things I learned while working with R and python…

Analyzing Fluorescence Microscopy Data with R and the CellSignalingTools package

Damiano March 9, 2016 0 Comments

This is nice

Working with R and Bioconductor on the cloud (Amazon EC2)

Damiano March 4, 2016 0 Comments

Some Bioconductor-based projects may be computationally challenging and require a lot of resources. If a powerful workstation is not available, it may be a good idea to work with R and Bioconductor at scale using Amazon Web Services (AWS). Setting…

bioinformatics

Amazon EC2, AMI, bioconductor, cloud, R, SSH

Building and Publishing R packages on CRAN or GitHub

Damiano March 3, 2016 0 Comments

Building an R package is an easy and very convenient way to keep your work well organized. Moreover, it facilitates sharing your code with the R community. Here we will discuss publishing R packages on CRAN and GitHub. The post…

data science

CRAN, Description, GitHub, Namespace, package, package.skeleton, R

DNA sequence manipulation in R: getting the Reverse Complement of a DNA string

Damiano February 4, 2016 0 Comments

Manipulating DNA/RNA sequences is a fundamental operation in Molecular Biology. Writing the reverse complement of a DNA sequence is very easy, but it is also an error-prone operation if performed manually. Sequence manipulation tools are available online and free of charge…

bioinformatics

complement, nucleic acids, R, reverse, sequence manipulation
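
As a quick illustration of the reverse-complement operation mentioned in the excerpt above, here is a minimal Python sketch. It is not the code from the post (which works in R); the function name `reverse_complement` is made up for this example, and it assumes the input contains only the unambiguous bases A, C, G and T.

```python
# Translation table mapping each DNA base to its complement (both cases).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string.

    Assumption for this sketch: only A, C, G, T (upper or lower case)
    are accepted; anything else raises ValueError.
    """
    invalid = set(seq) - set("ACGTacgt")
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    # Complement every base, then reverse the whole string.
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("GATTACA"))
```

Rejecting unexpected characters up front is a deliberate choice here: silently passing them through is exactly the kind of mistake that makes the manual approach error-prone.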

easyPubMed for business: scraping PubMed data in R for a targeting campaign

Damiano January 21, 2016 6 Comments

In this post, I will cover how to use easyPubMed (R Package) to retrieve data from PubMed. This example is focused on data extraction from PubMed records for a targeting campaign. The post is aimed at suggesting a business-oriented way…

data science, tutorials

businessPubMed, CRAN, easyPubMed, PubMed, R, regex, scraping

Querying PubMed via the easyPubMed package in R

Damiano January 5, 2016 36 Comments

PubMed (NCBI Entrez) is an online database of citations for biomedical literature that is available at the following URL: http://www.ncbi.nlm.nih.gov/pubmed. Retrieving data from PubMed is also possible in an automated way via the NCBI Entrez E-utilities. A description of how…

data science

PubMed, R, regex, XML
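
To give an idea of what an automated PubMed query through the NCBI Entrez E-utilities looks like, here is a small Python sketch that only builds an ESearch request URL. It does not use easyPubMed and does not contact NCBI; the helper name `build_esearch_url` and the example search term are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Base address of the E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term: str, retmax: int = 20) -> str:
    """Build an ESearch URL for a PubMed query (hypothetical helper).

    db     -> the Entrez database to search ("pubmed")
    term   -> the query string, URL-encoded by urlencode
    retmax -> maximum number of record IDs to return
    """
    params = {"db": "pubmed", "term": term, "retmax": retmax}
    return ESEARCH_URL + "?" + urlencode(params)


# Example query term chosen only for illustration.
print(build_esearch_url("asthma", retmax=5))
```

Fetching that URL returns an XML document listing the matching PubMed IDs, which is the kind of output that then needs to be parsed.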

Scraping Impact Factor data from the Web using httr and regex in R

Damiano January 2, 2016 0 Comments

A couple of days ago, I found a website listing Impact Factor data of many scientific Journals organized in HTML tables (http://www.citefactor.org). Unfortunately, this website didn’t allow users to download Impact Factor tables in one click. Moreover, data were scattered over…

data science

HTML, httr, R, regex, scraping

Retrieving Expression Levels of all Members of a Gene Family (GO) from an Oncomine DataSet

Damiano October 29, 2015 0 Comments

This page describes a quick way to extract gene expression information for a specific group of genes (as defined in one or more GO terms) from an Oncomine DataSet. This is useful, for example, if we want to study a…

bioinformatics

bioconductor, Gene Ontology, GO.db, Oncomine, R, Volcano

Exploratory analysis of datasets obtained from GEO

Damiano October 22, 2015 2 Comments

Gene Expression Omnibus (GEO) is an online public repository of functional genomics data. Information about GEO may be found at the following URL: http://www.ncbi.nlm.nih.gov/geo/info/faq.html. Briefly, GEO includes different types of datasets: GEO Profiles are curated datasets obtained from GEO DataSets….

bioinformatics

bioconductor, colorhcplot, exploratory, GEO, GEOmetadb, GEOquery, pca, R

Colorful Hierarchical Clustering Dendrograms with R

Damiano September 30, 2015 0 Comments

Hierarchical clustering is a very effective method for exploratory data analysis and is aimed at building a hierarchy of clusters based on the similarity of the samples in a dataset. The idea behind hierarchical clustering is very intuitive. Let’s assume…

bioinformatics

clustering, exploratory, hclust, hierarchical, R

RPKM calculation and relative gene expression quantification

Damiano May 4, 2015 0 Comments

RNAseq data may provide an estimate of the relative expression level of different genes in a sample or in a cell type. It is sufficient to compare RPKM (reads per kilobase of transcript per million mapped reads) values of the genes of…

bioinformatics

bioconductor, R, RNA-seq, RPKM
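
For readers who want the RPKM arithmetic spelled out, here is a minimal Python sketch using the standard definition (read count scaled by transcript length in kilobases and by total mapped reads in millions). It is not the code from the post; the function name `rpkm` and all the numbers below are invented for illustration.

```python
def rpkm(read_count: float, gene_length_bp: float, total_mapped_reads: float) -> float:
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = read_count * 1e9 / (gene_length_bp * total_mapped_reads)
    The factor 1e9 combines the per-kilobase (1e3) and per-million (1e6) scaling.
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Made-up example: two genes in the same sample (10 million mapped reads).
total_reads = 10_000_000
gene_a = rpkm(500, 2000, total_reads)  # 500 reads over a 2 kb transcript
gene_b = rpkm(500, 4000, total_reads)  # same reads over a 4 kb transcript

# With equal read counts, the shorter transcript gets the higher RPKM.
print(gene_a, gene_b, gene_a / gene_b)
```

The length normalization is what makes the comparison between genes meaningful: without it, longer transcripts would look more highly expressed simply because they collect more reads.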