Senior Software Developer / Bioinformatician (Genome Annotation and Analysis)
Post Details
Job Title
Senior Software Developer / Bioinformatician (Genome Annotation and Analysis)
Post Number
1004177
Closing Date
6 Jun 2022
Grade
SC5
Starting Salary
£41,193 - £50,448
Hours per week
37
Expected/Ideal Start Date
20 Jun 2022
Months Duration
36

Job Description

Main Purpose of the Job

Develop accurate and scalable computational pipelines and software tools for the annotation and analysis of complex eukaryotic genomes. The postholder will be responsible for leading the development of EI's genome annotation toolkit and supporting large-scale annotation projects. This will involve collaboration with EI faculty and with biologists and computational scientists from UK and international institutions. Develop methods that are robust to differences in genome characteristics, assembly contiguity, and the amount, quality and types of data available. Build pipelines to support pan-genome analysis and annotation of alternative haplotypes. Utilise modern workflow management and containerisation systems to automate complex workflows. Expand EI’s gene annotation pipelines to incorporate data from the latest sequencing technologies. Prepare manuscripts for submission and present results at internal and external meetings. Keep up to date with modern software engineering best practices, data analysis best practices and new technologies. The postholder will also have the opportunity to mentor junior bioinformaticians, interns and visitors.

Key Relationships

The postholder will report to David Swarbreck (Head of Core Bioinformatics) and interact on a daily basis with members of the Core Bioinformatics Group. They will also interact with all EI staff and external collaborators, particularly group leaders, domain experts and leads in relevant genomics research.

Main Activities & Responsibilities

Activity
Percentage
Design and implement new workflows within the framework of EI's genome annotation toolkit REAT (Robust and Extendable eukaryotic Annotation Toolkit, https://github.com/EI-CoreBioinformatics/reat)
20
Develop tools to project, consolidate, validate, and resolve differences in gene annotations across clades, subspecies, and haplotypes.
20
Develop approaches for integrating data from a variety of sequencing technologies and maintain and develop existing pipelines / annotation tools
15
Engage in research projects and analysis, developing bespoke tools and scripts.
15
Collaborate with EI faculty groups and UK and international partners on large-scale annotation and analysis projects.
15
Manage the release of data and tools to public repositories
5
Communicate the work of EI through oral presentations (at internal seminars and external conferences) and in writing, and publish novel findings and technology/tool developments in scientific journals.
5
Keep up to date with modern software engineering best practices, data analysis best practices and new technologies.
5
As agreed with line manager, any other duties commensurate with the nature of the post

Person Profile

Education & Qualifications

Requirement
Importance
PhD or equivalent experience in a relevant subject area (e.g. bioinformatics, computational biology, computer science)
Essential

Specialist Knowledge & Skills

Requirement
Importance
Excellent programming skills in several languages (ideally Python, R and shell scripting), including the ability to interpret legacy code
Essential
Proficiency with UNIX tools and version control (such as Git, Subversion)
Essential
Familiarity with cluster compute environments and workflow management systems
Essential
Understanding of testing frameworks and best practices
Essential
Knowledge of methods for genome annotation
Essential
Knowledge of pipeline workflow programming
Essential
Broad and in-depth understanding of bioinformatics tools
Essential

Relevant Experience

Requirement
Importance
Experience designing and implementing computational methods, tools and algorithms for large scale data analysis
Essential
Experience of workflow management systems, e.g. Snakemake, Cromwell, Nextflow
Essential
Experience developing software in either Python, C/C++ or Java
Essential
Experience using High Performance Computing (HPC) resources and job scheduling systems
Essential
Experience with version control
Essential
Experience with next generation sequencing (RNA-Seq, DNA-Seq) and large-scale data analysis
Desirable
Experience of working in a high throughput genomics environment with disciplined delivery of high quality usable data sets
Desirable
Experience with genome annotation tools and approaches
Desirable

Interpersonal & Communication Skills

Requirement
Importance
Excellent leadership skills
Essential
Capable of working with domain specialists to gather system requirements
Essential
Strong organisational and record keeping skills
Essential
Able to communicate clearly with EI staff at all levels
Essential
Able to work independently
Essential
Capable of writing academic papers and grants
Essential
Good communication skills, both written and verbal
Essential
Good interpersonal skills, with the ability to work well as part of a team
Essential
Networking and influencing skills and the ability to build effective collaborative links
Desirable

Additional Requirements

Requirement
Importance
Attention to detail
Essential
Promotes equality and values diversity
Essential
Willingness to participate in occasional training of other institute members and visitors, and as part of formal training programmes hosted at the institute
Essential
Willingness to embrace the expected values and behaviours of all staff at the Institute, ensuring it is a great place to work
Essential
Willingness to work outside standard working hours when required
Essential
Able to present a positive image of self and the Institute, promoting both the international reputation and public engagement aims of the Institute
Essential

Who We Are

Earlham Institute

Earlham Institute is a vibrant, contemporary research institute and registered charity, working in an area of rapid technological development and innovation.

Earlham Institute is strategically funded by the BBSRC to lead the development of a skill base in bioinformatics and a genomics technology platform for UK bioscience. The Institute is located on the Norwich Research Park, together with its partners: the John Innes Centre, the Institute of Food Research, The Sainsbury Laboratory, the University of East Anglia and the Norfolk and Norwich University Hospital. The research park has an excellent reputation for research in plant and microbial sciences, interdisciplinary environmental science and food, diet and health, to which Earlham Institute contributes strengths in genomics and bioinformatics. Close links exist between the NRP partners and new opportunities for collaboration in exciting new initiatives are under development. The NRP recently received £26M of government investment to facilitate innovation and further develop infrastructure to attract science and technology companies to the Park to enhance the vibrant environment and realise economic impact from research investment.

Earlham Institute is a UK hub for innovative Bioinformatics through research, analysis and interpretation of multiple, complex data sets. It hosts one of the largest computing hardware facilities dedicated to life science research in Europe. This has been boosted recently by an e-Infrastructure grant to expand the data storage capacity to a multi-petabyte unit, deploying a high performance cluster and large-memory server enabling the allocation of processes requiring several terabytes of computing memory.

Earlham Institute’s state-of-the-art DNA sequencing facility operates multiple complementary technologies for data generation that provide the foundation for analyses furthering our fundamental understanding of genomes and how they function. We aim to be at the forefront of technological advances and are developing and implementing technologies to generate and analyse new types of data. We also develop novel platforms to provide access to computational tools and processing capacity for multiple academic and industrial users, and promote applications of computational bioscience. Earlham Institute has one fully owned subsidiary, Genome Enterprise Ltd (GEL), through which it offers genomic and bioinformatics services on a trading basis and works with commercial providers on a partnership basis. Earlham Institute also receives specific funding to enable knowledge exchange programmes, which are supported across the institute teams.

Department

Digital Biology

Group Details

The Core Bioinformatics Group led by David Swarbreck provides high quality computational analyses of a range of sequence data types and develops software and analysis pipelines that support the Institute's Core Strategic Programmes and National Capability in Genomics and Single Cell Analysis (NCGS). The National Capability provides access to state of the art genomics technologies for research groups throughout the UK as well as delivering sequencing and genomics services to the associated Science Faculty groups and commercial customers. The group is involved in the analysis of sequence data from a wide variety of sequencing platforms and works alongside Earlham research groups to establish new software tools and analysis methods.

Management and Leadership

Requirement
Importance
Ability to lead projects to a successful conclusion
Essential
Coordinate work with others on collaborative projects
Essential

Advertisement

Senior Software Developer / Bioinformatician (Genome Annotation and Analysis)

Applications are invited for a Senior Software Developer/Bioinformatician to join the Core Bioinformatics Group at the Earlham Institute (EI), based in Norwich, UK.

The role:

This post will focus on developing novel computational tools and pipelines for large scale annotation of genes and other genomic features across a diverse range of species including plants, mammals, fish and protists.

This role will have responsibility for leading the development of EI's genome annotation pipelines and supporting large scale annotation / analysis projects. A key focus will be continuing the development of REAT - EI's Robust and Extendable eukaryotic Annotation Toolkit (https://github.com/EI-CoreBioinformatics/reat), building new workflows to address specific challenges relevant to the annotation of protist and plant genomes including annotation of non-culturable protists using single cell data. 

The role will develop scalable and robust methods to annotate eukaryotic species with large, repeat-rich and polyploid genomes, utilising data from cutting-edge sequencing technologies including PacBio IsoSeq and Nanopore.

As part of the Core Bioinformatics Group you will:
• Design and develop accurate and scalable computational pipelines for the annotation and analysis of complex eukaryotic genomes
• Continue development of existing pipelines / annotation tools
• Develop tools to support pan-genome analysis and annotation of alternative haplotypes
• Develop approaches for integrating data from a variety of sequencing technologies
• Work with the Core Bioinformatics team to produce high-quality, evidence-based genome annotations of protein-coding genes, lncRNAs and pseudogenes
• Utilise modern workflow management and containerisation systems to automate complex workflows. 
• Keep up to date with modern software engineering best practices, data analysis best practices and new technologies.
• Mentor junior bioinformaticians, interns and visitors.
• Collaborate with EI faculty groups and UK and international partners on large-scale projects including the Darwin Tree of Life (DToL).


The ideal candidate:

The successful candidate will possess a PhD or equivalent experience in computational biology, bioinformatics, computer science or a similar subject. They will also have a strong background in developing computational tools / pipelines and experience of large scale data analysis.

Candidates should possess a high level of skill and demonstrable experience of building tools and pipelines in Python, along with R and Bash scripting. Familiarity with cluster compute environments and workflow management systems is also essential.

The ideal candidate will also be highly collaborative and have demonstrable experience of successfully managing their time to deliver analyses and input into multiple projects.

Additional information:

Salary on appointment will be within the range £41,193 to £50,448 per annum depending on qualifications and experience. A market supplement may also apply depending on skills and experience. This is a full-time post for a contract of 3 years.

As a Disability Confident employer, we guarantee to offer an interview to all disabled applicants who meet the essential criteria for this vacancy.

The closing date for applications will be 6 June 2022.