I'm following the Snakemake tutorial from what I gather is the official documentation. After downloading the tools in the setup phase, I am stuck on understanding the use of wildcards in step 2, where I get this error message:

MissingInputException in rule bwa_map in file "/home/user/SnakeTest/SnakeTest/Snakefile", line 1:
Missing input files for rule bwa_map:
    output: mapped_reads/{sample}.bam
    wildcards: sample={sample}
    affected files:
        data/samples/{sample}.fastq

I suspected that I was missing some file, but everything is where it should be and properly named. This is what the whole Snakefile looks like; it has two variables defined globally because they are needed in later steps, but that doesn't change my tests.

My actual guess is that I am misunderstanding how wildcards work. I gathered that you don't need to define what Snakemake substitutes for the wildcard, because it infers the value from the available inputs, following the pattern given in the output directive. Here the patterns match: data/samples/{sample}.fastq corresponds to the three files data/samples/A.fastq, data/samples/B.fastq and data/samples/C.fastq, but Snakemake begs to differ.

I don't get what I am misunderstanding. Can someone please shed some light on it? Thanks in advance.

  • You are right, this is not the expected behavior. In your case, Snakemake appears to be treating "{sample}" literally rather than as a wildcard, so it is looking for the file "data/samples/{sample}.fastq". I'm not sure why this happens. When I run the full Snakefile you provided, I get the expected error ("Missing input files for rule bwa_map: data/samples/A.fastq"), because I don't have these files in the working directory. Which version of Snakemake are you running? Commented Oct 30 at 9:43
  • Glad I am not hallucinating. My snakemake is v9.13.4. Commented Oct 30 at 11:58
  • I just tried it again with v9.13.4. The folder contains nothing except one Snakefile. When I run snakemake -n, I get the expected output: "MissingInputException in rule bwa_map ... affected files: data/samples/A.fastq", not "data/samples/{sample}.fastq". Commented Oct 30 at 13:16
  • The Snakefile is the same one I use; I can't tell why it keeps seeing {sample} in place of any of the three files. Side question: why did it focus on A.fastq in your case? Commented Oct 30 at 15:46
  • Snakemake looks at what you want (the input of the first rule, rule all) and works its way backwards to figure out how to get it, building the DAG. For A.fastq it finds neither a file nor a rule that produces it, so it reports an error. In this case A.fastq is just the first one Snakemake tries. If you changed the order of the SAMPLES list, or provided data/samples/A.fastq and data/genome.fa, Snakemake would give the same error for B.fastq. Commented Oct 30 at 16:37
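
The backward resolution described in the last comment can be sketched in plain Python. This is only a toy model, not Snakemake's actual implementation: the helper names match_wildcards and resolve_inputs are made up for illustration, and the sketch assumes each wildcard appears once per pattern and matches the regex .+ (the default Snakemake documents for wildcards).

```python
import re


def match_wildcards(pattern, target):
    """Hypothetical helper: return the wildcard values if target fits pattern, else None."""
    # re.split with a capture group alternates literal text and wildcard names.
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)        # literal piece of the path
        else:
            regex += f"(?P<{part}>.+)"      # wildcard, assumed to match ".+"
    m = re.fullmatch(regex, target)
    return m.groupdict() if m else None


def resolve_inputs(input_patterns, wildcards):
    """Hypothetical helper: fill the matched values into the rule's input patterns."""
    return [p.format(**wildcards) for p in input_patterns]


OUTPUT = "mapped_reads/{sample}.bam"
INPUTS = ["data/genome.fa", "data/samples/{sample}.fastq"]

# A concrete target: the wildcard is inferred from the requested output file.
wc = match_wildcards(OUTPUT, "mapped_reads/A.bam")
print(wc, resolve_inputs(INPUTS, wc))

# A target that still contains the text "{sample}" also fits the pattern.
wc_literal = match_wildcards(OUTPUT, "mapped_reads/{sample}.bam")
print(wc_literal, resolve_inputs(INPUTS, wc_literal))
```

In this toy model the wildcard value comes from the requested output, not from the files present in data/samples/. The second call shows that a target which was never expanded would yield sample={sample} and the input data/samples/{sample}.fastq; whether that is what happens in the question is not established here.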

1 Answer

Not really an answer, but this might help with troubleshooting. What do you get in a new directory containing nothing except a minimal Snakefile with this content:

SAMPLES = ["C","D"]


rule test:
    input:
        expand("mapped_reads/{sample}.bam", sample=SAMPLES)


rule bwa_map:
    input:
        "data/genome.fa",
        "data/samples/{sample}.fastq"
    output:
        "mapped_reads/{sample}.bam"
    shell:
        "bwa mem {input} | samtools view -Sb - > {output}"

and you run snakemake -n in that folder?
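
For reference, the targets that rule test requests can be approximated in plain Python. This is a rough sketch of what expand does for a simple pattern, not Snakemake's own code; expand_sketch is a made-up name, and it assumes every wildcard is given as a list of values.

```python
from itertools import product


def expand_sketch(pattern, **wildcards):
    """Hypothetical stand-in for expand(): one filled-in path per value combination."""
    names = list(wildcards)
    combos = product(*(wildcards[name] for name in names))
    return [pattern.format(**dict(zip(names, combo))) for combo in combos]


SAMPLES = ["C", "D"]
targets = expand_sketch("mapped_reads/{sample}.bam", sample=SAMPLES)
print(targets)  # ['mapped_reads/C.bam', 'mapped_reads/D.bam']
```

If the dry run works as the comments describe, any missing-input error should name a concrete file derived from one of these targets rather than a path containing {sample}.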
