Channel: YOKOFAKUN
Browsing latest articles
Browse All 105 View Live

XML+XSLT = #Makefile -based #workflows for #bioinformatics

I've recently read some conversations on Twitter about Makefile-based bioinformatics workflows. I've suggested on biostars.org (Standard simple format to describe a bioinformatics analysis pipeline)...

Divide-and-conquer in a #Makefile : recursivity and #parallelism.

This post is my notebook about implementing a divide-and-conquer strategy in GNU make. Say you have a list of 'N' VCF files. You want to create a list of: common SNPs in vcf1 and vcf2, common SNPs in...
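The divide-and-conquer idea can be sketched outside of make. The block below is a minimal Python illustration, not the Makefile from the post: it assumes each VCF has already been reduced to a set of (chrom, pos, ref, alt) tuples, and the names `common_snps`, `vcf1`, `vcf2` and `vcf3` are hypothetical.

```python
def common_snps(snp_sets):
    """Recursively intersect a non-empty list of SNP sets by halving the list."""
    if not snp_sets:
        raise ValueError("need at least one SNP set")
    if len(snp_sets) == 1:
        return set(snp_sets[0])
    mid = len(snp_sets) // 2
    # The two halves do not depend on each other, so they could be
    # computed in parallel (in make, as two independent targets).
    left = common_snps(snp_sets[:mid])
    right = common_snps(snp_sets[mid:])
    return left & right


# Hypothetical toy data: each set stands for the SNPs found in one VCF.
vcf1 = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T")}
vcf2 = {("chr1", 100, "A", "G"), ("chr2", 300, "G", "A")}
vcf3 = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T")}

shared = common_snps([vcf1, vcf2, vcf3])
print(sorted(shared))
```

Because each recursive call only needs its own half of the list, the depth of the computation grows with log2(N) rather than N when the halves run concurrently.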

Compiling a C++ 'Hello world' program using the #NCBI C++ toolbox: my notebook.

This post is my notebook for compiling a simple C++ application using the NCBI C++ toolbox (http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/). This application prints 'Hello world' and takes two...

Filtering Fasta Sequences using the #NCBI C++ API. My notebook.

In my previous post (http://plindenbaum.blogspot.com/2015/01/compiling-c-hello-world-program-using.html) I've built a simple "Hello World" application using the NCBI C++ toolbox...

Listing the 'Subject' Sequences in a BLAST database using the NCBI C++...

In my previous post (http://plindenbaum.blogspot.com/2015/01/filtering-fasta-sequences-using-ncbi-c.html) I've built an application filtering FASTA sequences using the NCBI C++ toolbox...

Automatic code generation for @knime with XSLT: An example with two nodes:...

KNIME is a java+eclipse-based graphical workflow manager. Biologists in my lab often use this tool to filter VCFs or other tabular data. A software development kit (SDK) is provided to build new...

Drawing a Manhattan plot in SVG using a GWAS+XML model.

On Friday, I saw my colleague @b_l_k starting to write SVG+XML code to draw a Manhattan plot. I told him that a better idea would be to describe the data using XML and to transform the XML to SVG using...

Integrating a java program in #usegalaxy.

This is my notebook for the integration of java programs in https://usegalaxy.org/. Create a directory for your tools under ${galaxy-root}/tools: mkdir ${galaxy-root}/tools/jvarkit. Put all the required...

Playing with hadoop/mapreduce and htsjdk/VCF : my notebook.

The aim of this test is to get a count of each type of variant/genotypes in a VCF file using Apache Hadoop and the java library for NGS htsjdk. My source code is available at:...

Monitoring a java application with mbeans. An example with samtools/htsjdk.

"A MBean is a Java object that follows the JMX specification. A MBean can represent a device, an application, or any resource that needs to be managed. The JConsole graphical user interface is a...

Playing with the #GA4GH schemas and #Avro : my notebook

After watching David Haussler's talk "Beacon Project and Data Sharing APIs", I wanted to play with Avro and the models and APIs defined by the Global Alliance for Genomics and Health (ga4gh) coalition...

A BLAST to SAM converter.

Some time ago, I received a set of Ion-Torrent /mate-reads with a poor quality. I wasn't able to align much using bwa. I've always wondered if I could get better alignments using NCBI-BLASTN...

Playing with #Docker , my notebook

This post is my notebook about docker, written after a very nice introduction to docker by François Moreews (INRIA/IRISA, Rennes). I've used docker today for the first time, my aim was just to create...

GATK-UI : a java-swing interface for the Genome Analysis Toolkit.

I've just pushed GATK-UI, a java swing interface for the Genome Analysis Toolkit GATK at https://github.com/lindenb/gatk-ui. This tool is also available as a WebStart/JNLP application. Screenshot Why...

Happy birthday my blog. You are now ten years old.

Happy birthday my blog. You are now 10 years old.

Registering a tool in the @ELIXIREurope registry using XML, XSLT, JSON and...

The Elixir Registry / pmid:26538599 "A portal to bioinformatics resources world-wide. With community support, the registry can become a standard for dissemination of information about bioinformatics...

Reading a VCF file faster with java 8, htsjdk and java.util.stream.Stream

java 8 streams "support functional-style operations on streams of elements, such as map-reduce transformations on collections". In this post, I will show how I've implemented a java.util.stream.Stream...

Now in picard: two javascript-based tools filtering BAM and VCF files.

SamJS and VCFFilterJS are two tools I wrote for jvarkit. Both tools use the javascript engine embedded in java to filter BAM or VCF files. To get a broader audience, I've copied those functionalities to...

Finding new intron-exon junctions using the public Encode RNASeq data

I've been asked to look for some new / suspected / previously uncharacterized intron-exon junctions in public RNASeq data. I've used the BAMs under...

Playing with the @ORCID_Org / @ncbi_pubmed graph. My notebook.

"ORCID provides a persistent digital identifier that distinguishes you from every other researcher and, through integration in key research workflows such as manuscript and grant submission, supports...

pubmed: extracting the 1st authors' gender and location who published in the...

In this post I'll get some statistics about the 1st authors in the "Bioinformatics" journal from pubmed. I'll extract their genders and locations. I'll use some tools I've already described some years...

Playing with #magicblast, the #NCBI Short read mapper. My notebook

NCBI MAGIC Blast was recently mentioned by BioMickWatson on twitter. Looks pretty cool. Perhaps once again the answer to all bfx questions will be BLAST RE https://t.co/4D5e9QQnrb...

Writing a Custom ReadFilter for the GATK, my notebook.

The GATK contains a set of predefined read filters that "filter or transfer incoming SAM/BAM data files": BadCigar, BadMate, CountingRead, DuplicateRead, FailsVendorQualityCheck, LibraryRead, MalformedRead...

Hello WDL ( Workflow Description Language )

This is a quick note about my first WDL workflow (Workflow Description Language) https://software.broadinstitute.org/wdl/. As a Makefile, my workflow would be the following one: NAME?=world...

Creating a custom GATK Walker (GATK 3.6) : my notebook

This is my notebook for creating a custom engine in GATK.

Description: I want to read a VCF file and to get a table of category/count. Something like this:

HAVE_ID  TYPE  COUNT
YES      SNP   123
NO       SNP   3
NO...
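The category/count table described in this teaser can be sketched without GATK. The block below is a minimal Python illustration, not the GATK walker from the post: it parses tab-delimited VCF lines directly, the function names are hypothetical, and the classification rule (SNP when REF and every ALT allele are a single base, OTHER otherwise) is a simplifying assumption.

```python
from collections import Counter


def variant_type(ref, alt):
    """Crude classification (assumption): SNP if REF and all ALT alleles are one base."""
    alleles = alt.split(",")
    if len(ref) == 1 and all(len(a) == 1 and a in "ACGTN" for a in alleles):
        return "SNP"
    return "OTHER"


def count_categories(vcf_lines):
    """Count variants per (HAVE_ID, TYPE) pair from an iterable of VCF lines."""
    counts = Counter()
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos, vid, ref, alt = line.rstrip("\n").split("\t")[:5]
        have_id = "NO" if vid == "." else "YES"
        counts[(have_id, variant_type(ref, alt))] += 1
    return counts


# Hypothetical toy input: three variant records.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\trs1\tA\tG\t.\t.\t.",
    "1\t200\t.\tC\tT\t.\t.\t.",
    "1\t300\t.\tAT\tA\t.\t.\t.",
]

counts = count_categories(example)
print("HAVE_ID", "TYPE", "COUNT", sep="\t")
for (have_id, vtype), n in sorted(counts.items()):
    print(have_id, vtype, n, sep="\t")
```

A `Counter` keyed on the (HAVE_ID, TYPE) tuple keeps the tally in a single pass over the file; adding a further category would only mean extending the key.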
