Senior Data Engineers

Job Number: 644345_Data Engineer (Bolling AFB)
Job Category: Scientist / Research & Development
Bolling, MD US
Yes, 10% of the time
Day Job
Potential for Teleworking:
Clearance Level Must Currently Possess: Top Secret/SCI
Clearance Level Must Be Able to Obtain:
The Advanced Solutions Group (ASG) at Leidos currently has an opening for a Senior Data Engineer in our Arlington, VA office or Bolling AFB office. The successful candidate will establish a data engineering pipeline process that supports complex and diverse software systems performing processing-intensive data enrichment and analytics on extremely large data sets, in an agile, fast-paced development environment spanning research, rapid prototyping, and transition to production. This data engineering expertise will be applied to establish a data analytics pipeline in support of intelligence community analysts whose mission is to solve unique and challenging problems with Publicly Available Information (PAI) and Document and Media Exploitation (DOMEX) data.

Roles and Responsibilities:
- Work closely with software developers, network engineers, senior investigators, researchers, and data scientists on agile teams to design and optimize data science platforms in support of Leidos internal and DoD/IC customer requirements.
- Lead the software development and integration of these software components to optimize system design and performance.
- Lead the design, implementation, and integration of software applications utilizing state-of-the-art agile processes to deliver leading-edge analytic solutions.
- Demonstrate an aptitude for problem solving; for identifying, transforming, and exploiting new publicly available information (PAI) data sources; and for thinking outside the box. A strong sense of accountability is desired.
- Apply a mix of technical excellence, intellectual curiosity, customer focus, and operational experience to improve the performance and user adoption of high-end data analytics platforms in partnership with a highly qualified, highly motivated team.
- The successful data engineer will be part of a multi-disciplinary research, development, and fusion analytics team and have the unique opportunity to interact directly with research scientists, analysts, and other technologists to develop data collection and exploitation pipeline applications and tools to meet mission requirements. 
- Must be a motivated, self-driven team player who can multitask, interact well with others, and advise or consult with other team members on systems engineering and software development issues.
- Expertise in analyzing, designing, building, testing, and deploying capabilities and services to meet data needs
- Experience with defining, recording, and delivering data requirements
- Experience with developing and maintaining conceptual data models and delivering logical data models, physical data models, and physical databases
- Experience in designing data integration services and delivering source-to-target maps, data extract-transform-load (ETL) design specifications, and data conversion designs
- Experience in writing software code and scripts to distribute the processing of information extraction tasks that identify entities, events, and relationships from a large corpus of structured and unstructured data and media stored in a distributed file system

To be considered for this position, you must minimally meet the knowledge, skills, and abilities listed below:
- B.S. in Computer Science, Information Science, or a related field, and at least 6 years of total experience
- Must have an active TS/SCI security clearance 
- 6 years of experience in scripting (Python)
- 6 years of (concurrent) experience with a high-level programming language: C#, C++, Python, or Java
- Experience developing and implementing algorithms using common distributed processing frameworks: MapReduce, Bulk Synchronous Parallel, Online Analytic Graph Processing, Interactive AQL processing
- Experience with importing and exporting data between an external RDBMS and a Hadoop cluster, including the ability to import subsets and to change the delimiter and file format of imported data during ingest
- Expertise in moving data into and out of the Hadoop Distributed File System (HDFS) using the HDFS command line interface
- Experience in just-in-time indexing of large metadata and text repositories using Lucene-based systems (Solr and Elasticsearch)
- Expertise in constructing applications on top of distributed computing clusters such as Hadoop and Elasticsearch
- Experience in Commercial and/or Government Cloud environments (AWS, C2S, etc.)

Preferred Qualifications:
Candidates with these desired skills will be given preferential consideration:
- Experience with Graph databases 
- Experience in IC analysis tools (e.g., iBase/Analyst’s Notebook, Palantir, etc.) 
- Experience in Geospatial data and the development of geospatially based analytical models 
- Experience in Text and Network processing: natural language processing (NLP), searching (e.g., Lucene), topic extraction, summarization, clustering, etc. 

External Referral Eligible

Leidos Overview:
Leidos is a global science and technology solutions leader working to solve the world’s toughest challenges in the defense, intelligence, homeland security, civil, and health markets. The company’s 33,000 employees support vital missions for government and commercial customers. Headquartered in Reston, Virginia, Leidos reported pro forma annual revenues of approximately $10 billion for the fiscal year ended January 1, 2016, after giving effect to the recently completed combination of Leidos with Lockheed Martin's Information Systems & Global Solutions business (IS&GS). For more information, visit www.Leidos.com. Qualified women, minorities, individuals with disabilities, and protected veterans are encouraged to apply. Leidos will consider qualified applicants with criminal histories for employment in accordance with relevant laws. Leidos is an Equal Opportunity Employer.