Data Engineer

Date: Mar 11, 2023

Location: Columbus, OH, US, 43202

Company: American Chemical Society

CAS uses intuitive technology, unparalleled scientific content and unmatched human expertise to help companies create groundbreaking innovations that benefit the world. As the scientific information solutions division of the American Chemical Society, CAS manages the largest curated reservoir of scientific knowledge, and for 116 years, has helped innovators mine, assess and apply that information to keep businesses thriving. The CAS team is global, diverse, endlessly curious and strives to make scientific insights accessible to innovators worldwide.


CAS is currently seeking a Java/Scala Developer to join one of the teams that build applications for our Big Data architecture and Data Lake – the Content Engine. This position will be located in our headquarters in Columbus, Ohio.


Technologies and Tools We Use to Build Solutions:

Hadoop, Scala, Apache Spark, Apache Kafka, Apache Flink, Apache Iceberg, Cascading, Solr, Docker, AWS (many services and technologies), OpenStack, Jenkins, Maven, Git, Java, Groovy, JavaScript, Python, Jetty, REST, Node, Eclipse, IntelliJ, SQL, Linux, Gerrit, JUnit, Ruby, Cucumber, Jira, Confluence, MarkLogic and others that we haven’t discovered yet.


Job Responsibilities:

· Participate in the CORE Platform Team, which is developing our next-generation content platform to support the evolving content-processing needs of CAS software applications and products.

· Develop features to support an immense-scale knowledge graph of many data domains. This includes the ability to define and maintain data schemas, capabilities to ingest content into the knowledge graph, and capabilities to create subscriptions that pull data from the knowledge graph.

· Develop and maintain the AWS-based CORE platform operational features (e.g. alerting, monitoring, administration, etc.) and CI/CD capabilities.

· Support functional and performance testing of the CORE platform including scalability testing to ensure CORE-based applications can meet their production SLAs.

· Provide consultation and assistance to application teams building CORE-based solutions. This may include working on those teams for short intervals as they integrate with CORE APIs.

· Collaborate with teammates and product owners to groom backlogs of epics and stories for upcoming development sprints.

· Connect across the organization in our Communities of Practice to build influential working relationships, preparing your career for tomorrow.

· Stay abreast of the latest technology trends through individual and team training opportunities.


Job Requirements:

· 4-year degree in computer science or engineering, or equivalent job experience.

· Additional degree or experience in Data Science is a plus.

· Substantial internship or work experience outside of college curriculum.

· Leadership: willingness to provide ownership for team deliverables and to adopt CAS Operating Principles related to trust, organizational growth, team building, and team performance.

· Pair Programming: candidates need to be willing to pair with others when it makes sense.

· Clear Communication and Healthy Dialogue: candidates must be comfortable and eager to discuss work items and issues in team settings.

· Passion for Development: preference for candidates who actively learn on the job and outside of work. We are a team of developers who constantly seek to improve our craft, and we expect the same of our teammates.

· Cross Functional Skill Development: willing to learn new skills and roles to meet the needs of their team.


Highly Desirable Skills/Experience:

· Application Delivery and Software Development: some experience is desirable.

· Big Data: any experience with the big data technology stack (e.g. Hadoop, Spark, Scala, Kafka, Cascading, Solr, Flink, etc.) is highly desirable but not a requirement.

· AWS: some experience with AWS services like S3, EMR, EC2, Lambda and CloudFormation (and/or CDK) is highly desirable.

· Java and Scala/Spark Development: any full-stack experience creating enterprise applications with automated builds, automated deployments, shell-scripting, and operating domains using a Java-based technology stack (e.g. Java, Scala, Git, Jenkins, OpenStack, Docker, Maven etc.) is desirable.

· Quality Engineering: fundamental experience and knowledge of Automated Testing, Test-driven Development, debugging, troubleshooting, and optimizing code/automation is desirable.

Chemical Abstracts Service (CAS), a division of the American Chemical Society, is the world’s authority for chemical information. CAS is the only organization in the world whose objective is to find, collect and organize all publicly disclosed chemical substance information. A team of scientists worldwide curates and controls the quality of our databases, which are recognized as the most comprehensive and authoritative by chemical and pharmaceutical companies, universities, government organizations and patent offices around the world. By combining these databases with advanced search and analysis technologies (SciFinder® and STN®), CAS delivers the most current, complete, secure and interlinked digital information environment for scientific discovery.


CAS offers a competitive salary and comprehensive benefits package, including a generous vacation plan, medical, dental, and vision insurance plans, and employee savings and retirement plans. Candidates for this position must be authorized to work in the United States and not require work authorization sponsorship by our company for this position now or in the future. EEO/Minority/Female/Disabled/Veteran

