A Two-Step Approach to Precision Job Filtering

This tutorial presents a two-step method for optimizing API queries for the Life Sciences and Biotechnology industries, so that highly relevant job listings are identified more efficiently and accurately.

5 min read · March 7, 2025

Fetching the right jobs from the API in specialized industries like Life Sciences and Biotechnology can be challenging. To streamline this process, we can use a two-step approach: filter by title keywords first, then verify the results locally with description filters. With this approach we don't even need to use any of the API's full-text search capabilities.

In this article, we'll walk through a Python script that fetches job listings based on title keywords and filters them further using highly relevant keywords from job descriptions.

Overview of the Method

Our approach involves two main steps:

  1. API Query with Title Keywords:
    We fetch job listings that match a set of predefined title keywords. These keywords are combined using the |OR| operator to ensure we capture a broad range of relevant job titles (you can combine up to 50 different keywords); a short example of the resulting query string follows this overview.

  2. Local Filtering with Description Keywords:
    Once we have the raw job listings, we filter them locally to ensure each job description contains at least two of the highly relevant keywords. This ensures that the jobs are not only relevant by title but also by their actual content.

By combining these two steps, we can efficiently narrow down job listings to those that are most relevant to the Life Sciences and Biotechnology industries.
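
As a quick illustration of step 1, here's what the combined title query looks like when a few of the keywords are joined with the |OR| operator (the same join that the fetch function below performs):

# Joining a few title keywords into a single query string
sample_keywords = ["Biotech", "Genomics", "Research Associate"]
title_query = "|OR|".join(sample_keywords)
print(title_query)  # Biotech|OR|Genomics|OR|Research Associate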

The Python Script

Below is the Python script that implements this method. We'll break it down into its key components and explain each part in detail.

Step 1: Define Keyword Lists

We start by defining two lists of keywords:

  1. Title Keywords:
    These are substrings that are likely to appear in job titles within the Life Sciences and Biotechnology industries.
title_keywords = [
    "Biotech", "Life Sciences", "Biopharma", "Pharmaceutical", "Genomics", 
    "Proteomics", "Bioinformatics", "Molecular Biology", "Cell Biology", 
    "Biochemistry", "Clinical Research", "Regulatory Affairs", "Quality Assurance", 
    "Biomanufacturing", "Biostatistics", "Pharmacology", "Toxicology", 
    "Biomedical Engineer", "Scientist", "Research Associate", "Lab Technician"
]
  2. Description Keywords:
    These are highly relevant keywords that typically appear in job descriptions for positions in these industries.
description_keywords = [
    "PCR", "ELISA", "CRISPR", "NGS", "RNA", "DNA", "cell culture", "flow cytometry", 
    "mass spectrometry", "HPLC", "LC-MS", "GMP", "GLP", "FDA", "ICH", "clinical trials", 
    "drug development", "bioprocessing", "fermentation", "protein purification", 
    "assay development", "statistical modeling", "bioanalytical", "pharmacokinetics", 
    "pharmacodynamics", "toxicology studies", "regulatory submissions"
]

Step 2: Fetch Jobs Using the API

We define a function fetch_jobs_by_title to fetch job listings from the API based on the title keywords. The keywords are combined using the |OR| operator to create a single query string.

import requests

API_URL = "https://jobdataapi.com/api/jobs/"
API_KEY = "your_api_key_here"  # Replace with your actual API key

def fetch_jobs_by_title(title_keywords):
    """
    Fetches job listings from the API based on title keywords using the |OR| operator.

    :param title_keywords: List of keyword substrings for job title search
    :return: List of job listings (each listing is a dictionary with 'title' and 'description' keys)
    """
    # Join up to 50 title keywords with |OR| for the API query
    title_query = "|OR|".join(title_keywords)

    # Make the API request
    params = {
        "title": title_query,
        "description_md": True,  # get Markdown version of job description
        "description_off": True,  # switch off HTML version
        "exclude_expired": True,  # only open positions
        "max_age": 90,  # only jobs published in the past 90 days
        "api_key": API_KEY
    }
    response = requests.get(API_URL, params=params)

    # Check if the request was successful
    if response.status_code == 200:
        res_json = response.json()
        print(f"Found {res_json['count']} jobs based on title keywords.")
        return res_json.get("results", [])
    else:
        print(f"Error fetching jobs: {response.status_code}")
        return []

Note: If you're not on an access pro subscription, you can remove the exclude_expired parameter from the params above and the request will still work.
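
The response also reports the total number of matching jobs (count), but a single request returns only one page of results. If you need more than the first page, a loop along these lines could follow the pagination. This is a sketch only: it assumes the paginated response exposes a next URL alongside count and results, which is a common convention but not confirmed by the snippet above.

def fetch_all_pages(params, max_pages=5):
    """
    Sketch: collect results across several pages, assuming the response
    contains a 'next' URL pointing to the following page.
    """
    results = []
    url = API_URL
    for _ in range(max_pages):
        # Query parameters only need to be sent with the first request;
        # the assumed 'next' URL would already carry them.
        response = requests.get(url, params=params if url == API_URL else None)
        if response.status_code != 200:
            print(f"Error fetching jobs: {response.status_code}")
            break
        data = response.json()
        results.extend(data.get("results", []))
        url = data.get("next")  # assumed pagination field
        if not url:
            break
    return results

If the API does expose such a field, you could call fetch_all_pages(params) in place of the single requests.get call in fetch_jobs_by_title and keep the rest of the script unchanged.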

Step 3: Filter Jobs Locally

Next, we define a function filter_jobs to filter the fetched job listings based on the description keywords. Each job description is checked to ensure it contains at least a minimum number of the highly relevant keywords (three by default, though the main script below passes 2).

def filter_jobs(job_listings, description_keywords, min_keywords=3):
    """
    Filters job listings to ensure each job description contains at least a minimum number
    of highly relevant keywords.

    :param job_listings: List of job listings (each listing is a dictionary with 'title' and 'description' keys)
    :param description_keywords: List of highly relevant keywords to check in job descriptions
    :param min_keywords: Minimum number of description keywords required for a job to be considered relevant
    :return: Filtered list of job listings
    """
    filtered_jobs = []

    for job in job_listings:
        description = job["description_md"].lower()

        # Count how many description keywords are in the job description
        keyword_count = sum(keyword.lower() in description for keyword in description_keywords)

        # If the job description contains at least min_keywords, add it to the filtered list
        if keyword_count >= min_keywords:
            filtered_jobs.append(job)

    return filtered_jobs
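
The substring check above is deliberately simple. If very short keywords such as "RNA" or "DNA" produce false positives (for example, the word "governance" contains the substring "rna"), a word-boundary variant along these lines might help. It's a sketch of an alternative matching strategy, not part of the original approach:

import re

def filter_jobs_strict(job_listings, description_keywords, min_keywords=3):
    """
    Variant of filter_jobs that matches keywords only on word boundaries,
    so short terms don't match inside unrelated words.
    """
    patterns = [re.compile(r"\b" + re.escape(kw.lower()) + r"\b") for kw in description_keywords]
    filtered_jobs = []
    for job in job_listings:
        description = job["description_md"].lower()
        # Count how many keyword patterns appear at least once in the description
        keyword_count = sum(bool(p.search(description)) for p in patterns)
        if keyword_count >= min_keywords:
            filtered_jobs.append(job)
    return filtered_jobs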

Step 4: Putting It All Together

Finally, we combine the two functions to fetch and filter job listings. Here's the main block that ties everything together:

if __name__ == "__main__":
    # Step 1: Fetch jobs using the title keywords
    job_listings = fetch_jobs_by_title(title_keywords)
    print(f"Fetched {len(job_listings)} jobs based on title keywords.")

    # Step 2: Filter jobs based on description keywords
    filtered_jobs = filter_jobs(job_listings, description_keywords, min_keywords=2)
    print(f"Filtered down to {len(filtered_jobs)} highly relevant jobs.")

    # Print the filtered jobs (for demonstration purposes)
    for job in filtered_jobs:
        print(f"Title: {job['title']}")
        print(f"Description: {job['description_md'][:100]}...")  # Print first 100 chars of Markdown description
        print(f"URL: {job['application_url']}")
        print("-" * 50)

Example Output

When you run the script, you'll see output similar to the following. Note that the API reports the total number of matching jobs, while a single request returns only one page of results (100 jobs here):

Found 35572 jobs based on title keywords.
Fetched 100 jobs based on title keywords.
Filtered down to 59 highly relevant jobs.
Title: Immunology/Inflammation Expert (Postdoctoral Fellow / Scientist)
Description: **Position Summary:**

We are seeking a
highly motivated Immunology/Inflammation Expert to join our ...
URL: https://f.zohorecruit.com/jobs/Careers/3...
--------------------------------------------------
Title: Data Scientist
Description: **Knowledge,
Skills, Competencies and Responsibilities: - Technical
Competency: -**

* Play a key ro...
URL: https://v.zohorecruit.com/jobs/Careers/5...
--------------------------------------------------

Conclusion

This script is highly customizable - you can adjust the keyword lists, modify the filtering criteria, or integrate it with other tools to further enhance its functionality. The goal here is to demonstrate a simple yet effective concept for getting a large number of relevant listings very quickly.
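
As one example of integrating the results with other tools, here's a minimal sketch that writes the filtered listings to a CSV file. It only uses fields already shown above (title, application_url, and the Markdown description); adjust the column list to whatever your downstream tool expects.

import csv

def export_jobs_to_csv(jobs, path="filtered_jobs.csv"):
    """Write the filtered job listings to a CSV file for use in other tools."""
    fieldnames = ["title", "application_url", "description_md"]
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        for job in jobs:
            # Keep only the selected columns; ignore any other fields on the job
            writer.writerow({key: job.get(key, "") for key in fieldnames})

# Example usage: export_jobs_to_csv(filtered_jobs)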

The fastest way to create the initial title and description keyword lists is to ask your favorite LLM to generate them for you. After that, you can review the results and modify or enhance them based on your experience or industry knowledge.
