XML+XSLT = #Makefile -based #workflows for #bioinformatics
I've recently read some conversations on Twitter about Makefile-based bioinformatics workflows. I've suggested on biostars.org (Standard simple format to describe a bioinformatics analysis pipeline)...
Divide-and-conquer in a #Makefile : recursivity and #parallelism.
This post is my notebook about implementing a divide-and-conquer strategy in GNU make. Say you have a list of 'N' VCF files. You want to create a list of: common SNPs in vcf1 and vcf2, common SNPs in...
Compiling a C++ 'Hello world' program using the #NCBI C++ toolbox: my notebook.
This post is my notebook for compiling a simple C++ application using the NCBI C++ toolbox (http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/). This application prints 'Hello world' and takes two...
Filtering Fasta Sequences using the #NCBI C++ API. My notebook.
In my previous post (http://plindenbaum.blogspot.com/2015/01/compiling-c-hello-world-program-using.html) I built a simple "Hello World" application using the NCBI C++ toolbox...
Listing the 'Subject' Sequences in a BLAST database using the NCBI C++...
In my previous post (http://plindenbaum.blogspot.com/2015/01/filtering-fasta-sequences-using-ncbi-c.html) I built an application filtering FASTA sequences using the NCBI C++ toolbox...
Automatic code generation for @knime with XSLT: An example with two nodes:...
KNIME is a Java+Eclipse-based graphical workflow manager. Biologists in my lab often use this tool to filter VCFs or other tabular data. A software development kit (SDK) is provided to build new...
Drawing a Manhattan plot in SVG using a GWAS+XML model.
On Friday, I saw my colleague @b_l_k starting to write SVG+XML code to draw a Manhattan plot. I told him that a better idea would be to describe the data using XML and to transform the XML to SVG using...
Integrating a java program in #usegalaxy.
This is my notebook for the integration of java programs in https://usegalaxy.org/. Create a directory for your tools under ${galaxy-root}/tools: mkdir ${galaxy-root}/tools/jvarkit. Put all the required...
Playing with hadoop/mapreduce and htsjdk/VCF : my notebook.
The aim of this test is to get a count of each type of variant/genotypes in a VCF file using Apache Hadoop and the java library for NGS htsjdk. My source code is available at:...
Monitoring a java application with mbeans. An example with samtools/htsjdk.
"A MBean is a Java object that follows the JMX specification. A MBean can represent a device, an application, or any resource that needs to be managed. The JConsole graphical user interface is a...
Playing with the #GA4GH schemas and #Avro : my notebook
After watching David Haussler's talk "Beacon Project and Data Sharing APIs", I wanted to play with Avro and the models and APIs defined by the Global Alliance for Genomics and Health (ga4gh) coalition...
A BLAST to SAM converter.
Some time ago, I received a set of Ion-Torrent /mate-reads of poor quality. I wasn't able to align much using bwa. I've always wondered if I could get better alignments using NCBI-BLASTN...
Playing with #Docker , my notebook
This post is my notebook about Docker, written after a very nice introduction to it by François Moreews (INRIA/IRISA, Rennes). I used Docker today for the first time; my aim was just to create...
GATK-UI : a java-swing interface for the Genome Analysis Toolkit.
I've just pushed GATK-UI, a java swing interface for the Genome Analysis Toolkit GATK at https://github.com/lindenb/gatk-ui. This tool is also available as a WebStart/JNLP application. Screenshot Why...
Happy birthday my blog. You are now ten years old.
Happy birthday my blog. You are now 10 years old.
Registering a tool in the @ELIXIREurope registry using XML, XSLT, JSON and...
The Elixir Registry / pmid:26538599 "A portal to bioinformatics resources world-wide. With community support, the registry can become a standard for dissemination of information about bioinformatics...
Reading a VCF file faster with java 8, htsjdk and java.util.stream.Stream
java 8 streams "support functional-style operations on streams of elements, such as map-reduce transformations on collections". In this post, I will show how I've implemented a java.util.stream.Stream...
Now in picard: two javascript-based tools filtering BAM and VCF files.
SamJS and VCFFilterJS are two tools I wrote for jvarkit. Both tools use the embedded java javascript engine to filter BAM or VCF files. To get a broader audience, I've copied those functionalities to...
Finding new intron-exon junctions using the public Encode RNASeq data
I've been asked to look for some new / suspected / previously uncharacterized intron-exon junctions in public RNASeq data. I've used the BAMs under...
Playing with the @ORCID_Org / @ncbi_pubmed graph. My notebook.
"ORCID provides a persistent digital identifier that distinguishes you from every other researcher and, through integration in key research workflows such as manuscript and grant submission, supports...
pubmed: extracting the 1st authors' gender and location who published in the...
In this post I'll get some statistics about the 1st authors in the "Bioinformatics" journal from pubmed. I'll extract their genders and locations. I'll use some tools I've already described some years...
Playing with #magicblast, the #NCBI Short read mapper. My notebook
NCBI MAGIC Blast was recently mentioned by BioMickWatson on twitter. Looks pretty cool. Perhaps once again the answer to all bfx questions will be BLAST RE https://t.co/4D5e9QQnrb...
Writing a Custom ReadFilter for the GATK, my notebook.
The GATK contains a set of predefined read filters that "filter or transfer incoming SAM/BAM data files": BadCigar, BadMate, CountingRead, DuplicateRead, FailsVendorQualityCheck, LibraryRead, MalformedRead...
Hello WDL ( Workflow Description Language )
This is a quick note about my first WDL workflow (Workflow Description Language) https://software.broadinstitute.org/wdl/. As a Makefile, my workflow would be the following one: NAME?=world...
Creating a custom GATK Walker (GATK 3.6) : my notebook
This is my notebook for creating a custom engine in GATK. Description I want to read a VCF file and to get a table of category/count. Something like this: HAVE_ID TYPE COUNT YES SNP 123 NO SNP 3 NO...