NCBI我想大家是再熟悉不过的了,有时候我们需要按照自己的想法来下载一些序列在比对一下,网上也有很多点来点去的教程,但是面对数据量大一点的序列他们往往不太灵。
比如我们想下物种Phytophthora 的ITS1序列,在NCBI上输入Phytophthora ITS1 就可以了。然后把Search details里面的内容填到脚本相应的内容里面就好啦。
下载序列其实是一道填空题。
#!/usr/bin/perl #use strict;######################################################## Application 3: Retrieving large datasets # # Goal: Download all chimpanzee mRNA sequences in FASTA format (>50,000 sequences).# # Solution: First use ESearch to retrieve the GI numbers for these sequences and post them on the History server, then use multiple EFetch calls to retrieve the data in batches of 500.# # Input: $query – chimpanzee[orgn]+AND+biomol+mrna[prop]# # Output: A file named "chimp.fna" containing FASTA data.# #######################################################use LWP::Simple; my ($query,$base,$url,$output,$web,$key,$count,$retmax);#$query = 'chimpanzee[orgn]+AND+biomol+mrna[prop]';$query = 'Phytophthora[Organism] AND ITS1[All Fields]';#assemble the esearch URL$base = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/";$url = $base."esearch.fcgi?db=nucleotide&term=$query&usehistory=y";print "$url","\n"; #post the esearch URL$output = get($url); die "Couldn't get it!" unless defined $output; #parse WebEnv, QueryKey and Count (# records retrieved)$web = $1 if ($output =~ /<WebEnv>(\S+)<\/WebEnv>/);$key = $1 if ($output =~ /<QueryKey>(\d+)<\/QueryKey>/);$count = $1 if ($output =~ /<Count>(\d+)<\/Count>/); #open output file for writingopen(OUT, ">Phytophthora_its.fna") || die "Can't open file!\n"; #retrieve data in batches of 500$retmax = 500;for (my $retstart = 0; $retstart < $count; $retstart += $retmax) { my $efetch_url = $base ."efetch.fcgi?db=nucleotide&WebEnv=$web"; $efetch_url .= "&query_key=$key&retstart=$retstart"; $efetch_url .= "&retmax=$retmax&rettype=fasta&retmode=text"; my $efetch_out = get($efetch_url); print OUT "$efetch_out"; } close OUT;
这是之前在下载数据的时候遇到一个脚本,觉得挺方便的,在这里分享给大家,但是不知道作者谁了,该作
作者:周运来就是我
链接:https://www.jianshu.com/p/c9b8270ed86b
点击查看更多内容
为 TA 点赞
评论
共同学习,写下你的评论
评论加载中...
作者其他优质文章
正在加载中
感谢您的支持,我会继续努力的~
扫码打赏,你说多少就多少
赞赏金额会直接到老师账户
支付方式
打开微信扫一扫,即可进行扫码打赏哦