# Fetching and Maintaining Fresh Job Listings

Efficiently fetch and maintain fresh job listings using the jobdata API by retrieving recent jobs and checking for expired listings daily, ensuring a clean and relevant local job database.

---

In this tutorial, we will explore a strategy to fetch the latest job listings from the jobdata API while ensuring that we keep our data fresh and up-to-date. We will utilize the `/api/jobs/` endpoint (see [docs](/c/jobs-api-endpoint-documentation/)) to retrieve recent job listings and the `/api/jobsexpired/` endpoint (see [docs](/c/jobs-expired-api-endpoint-documentation/)) to check for expired jobs. This approach will help you maintain a clean and relevant job database.

## Background and Design Rationale

The process described here keeps job listing management efficient and cost-effective by splitting active job retrieval and expiration tracking across two endpoints. The `/api/jobs/` endpoint provides detailed job data for initial imports (non-expired only when the `exclude_expired=true` flag is set), while the `/api/jobsexpired/` endpoint offers a lightweight feed of recently expired jobs, including their expiration dates.

This two-step process optimizes resource usage by avoiding repeated fetching of full job details and focusing only on updating expiration statuses in your local database. By periodically checking the `/api/jobsexpired/` feed and matching it against your database, you can maintain a fresh and accurate dataset with minimal API calls.

## Prerequisites

Before we begin, ensure you have the following:

1. **API Key**: You need a valid API key from jobdata API with an active **access pro** subscription. If you don't have one, you can generate it from your dashboard after [subscribing to the plan](/accounts/pricing/).
2. **Python Environment**: Make sure you have Python installed along with the `requests` library. You can install it using pip if you haven't already:

```bash
pip install requests
```

## Step 1: Fetch Recent Job Listings

We will start by fetching job listings that are not expired and have been published within the last 30 days. We will use the `exclude_expired=true` and `max_age=30` parameters in our request to the [`/api/jobs/` endpoint](/c/jobs-api-endpoint-documentation/).

### Example Code to Fetch Jobs with Pagination

```python
import requests
from datetime import datetime, timedelta

# Constants
API_KEY = "YOUR_API_KEY"  # Replace with your actual API key
JOBS_URL = "https://jobdataapi.com/api/jobs/"
EXPIRED_URL = "https://jobdataapi.com/api/jobsexpired/"
FETCH_DATE = datetime.now()  # Current date for fetching jobs

# Function to fetch recent job listings with pagination
def fetch_recent_jobs():
    headers = {"Authorization": f"Api-Key {API_KEY}"}
    params = {
        "exclude_expired": "true",
        "max_age": 30,
        "page_size": 1000,  # Fetch 1000 jobs per request
        "page": 1  # Start from the first page
    }
    
    all_jobs = []  # List to store all fetched jobs
    while True:
        response = requests.get(JOBS_URL, headers=headers, params=params, timeout=30)  # Timeout so a stalled connection cannot hang the loop
        if response.status_code == 200:
            job_data = response.json()
            all_jobs.extend(job_data['results'])  # Add fetched jobs to the list
            # ...or directly import jobs into your local DB here...
            print(f"Fetched {len(job_data['results'])} jobs from page {params['page']}.")
            # Check if there is a next page
            if job_data['next']:
                params['page'] += 1  # Increment page number for the next request
            else:
                break  # Exit loop if no more pages
        else:
            print("Error fetching jobs:", response.status_code, response.text)
            break

    return all_jobs

# Fetch jobs
recent_jobs = fetch_recent_jobs()
print(f"Total fetched job listings: {len(recent_jobs)}")
```

### Explanation

- We set the `exclude_expired` parameter to `true` to filter out expired jobs.
- The `max_age` parameter is set to `30` to only fetch jobs published in the last 30 days.
- We specify `page_size` as `1000` to minimize the number of requests.
- The `while` loop handles pagination, fetching all available job listings sequentially.
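
The pagination loop can also be written as a generator, so that jobs are handed to your import code one at a time instead of being collected in a single list. The sketch below is one possible shape and not part of any official client: `iter_results` and the `get_page` callable are hypothetical names, and the fake pages only mimic the `results` and `next` keys that the loop above reads.

```python
def iter_results(get_page, start_page=1):
    """Yield items one by one from a paginated API.

    get_page(page_number) must return a dict with 'results' and 'next' keys,
    the same shape the loop above reads from response.json().
    """
    page = start_page
    while True:
        data = get_page(page)
        yield from data["results"]
        if not data.get("next"):
            break  # No further pages
        page += 1


# Fake two-page response used only to demonstrate the control flow
fake_pages = {
    1: {"results": [{"id": 1}, {"id": 2}], "next": "page-2"},
    2: {"results": [{"id": 3}], "next": None},
}

job_ids = [job["id"] for job in iter_results(fake_pages.get)]
print(job_ids)
```

With the real API, `get_page` would wrap the same `requests.get` call used in `fetch_recent_jobs`, passing the requested page number in `params` and returning `response.json()`.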

## Step 2: Check for Expired Jobs

Next, we check for expired jobs once a day. We use the [`/api/jobsexpired/` endpoint](/c/jobs-expired-api-endpoint-documentation/) to retrieve jobs that have expired since the day before the last fetch date. This one-day buffer ensures we don't miss jobs that expired shortly before the previous check.

### Example Code to Check Expired Jobs

```python
# Function to check for expired jobs
def check_expired_jobs():
    headers = {"Authorization": f"Api-Key {API_KEY}"}
    expired_since = (FETCH_DATE - timedelta(days=1)).strftime('%Y-%m-%d')  # One day buffer
    params = {
        "expired_since": expired_since,
        "page_size": 1000,  # Fetch 1000 expired jobs per request
        "page": 1  # Start from the first page
    }
    
    expired_job_items = []  # List to store all expired job items
    while True:
        response = requests.get(EXPIRED_URL, headers=headers, params=params, timeout=30)  # Timeout so a stalled connection cannot hang the loop
        if response.status_code == 200:
            expired_data = response.json()
            print(f"Fetched {len(expired_data['results'])} expired job items from page {params['page']}.")
            expired_job_items.extend(expired_data['results'])  # Add expired jobs to the list
            # ...or directly update jobs by their ID in your local DB here...
            # Check if there is a next page
            if expired_data['next']:
                params['page'] += 1  # Increment page number for the next request
            else:
                break
        else:
            print("Error fetching expired jobs:", response.status_code, response.text)
            break

    return expired_job_items

# Check for expired jobs
expired_jobs = check_expired_jobs()
```

### Explanation

- We calculate the `expired_since` date by subtracting one day from `FETCH_DATE` (the time of the current run) and formatting it as `YYYY-MM-DD`.
- We use a `while` loop to paginate through the results, fetching up to 1000 expired jobs at a time.
- The `next` link in the response helps us determine if there are more pages to fetch.
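
In a scheduled daily job, the script restarts on every run, so the last fetch date has to be stored somewhere between runs. The following is a minimal sketch, assuming a small JSON state file; the function names, the file name, and the `last_fetch` key are all hypothetical choices and not something the API prescribes.

```python
import json
import tempfile
from datetime import datetime, timedelta
from pathlib import Path


def save_last_fetch(state_path, now=None):
    """Record the time of this run so the next run can start from it."""
    now = now or datetime.now()
    Path(state_path).write_text(json.dumps({"last_fetch": now.isoformat()}))


def load_expired_since(state_path, buffer_days=1, now=None):
    """Return the expired_since value: last fetch date minus the buffer.

    Falls back to the current time if no state file exists yet.
    """
    path = Path(state_path)
    if path.exists():
        stored = json.loads(path.read_text())["last_fetch"]
        last_fetch = datetime.fromisoformat(stored)
    else:
        last_fetch = now or datetime.now()
    return (last_fetch - timedelta(days=buffer_days)).strftime("%Y-%m-%d")


# Demonstration with a temporary directory and a fixed, made-up timestamp
with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "jobdata_state.json"
    save_last_fetch(state_file, now=datetime(2024, 3, 10, 8, 0))
    expired_since = load_expired_since(state_file)

print(expired_since)
```

Calling `save_last_fetch` only after a run has completed successfully means a failed run is retried from the same date the next day.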

## Step 3: Update Your Database

After (or while) fetching the expired jobs, you should update your local database to remove or mark these jobs as expired. You can match the job IDs from the expired jobs response against your local database entries.

### Example Code to Update Database

```python
# Example function to update the database (pseudo-code)
def update_database(expired_jobs):
    for job in expired_jobs:
        # Pseudo-code for database update
        print(f"Updating database for expired job ID: {job['id']}")
        # db.update_job_status(job['id'], status='expired')

# Call the update function with the expired job items
update_database(expired_jobs)
```

### Explanation

- The `update_database` function is a placeholder for your actual database update logic. You would replace the print statement with your database update code that matches jobs by IDs existing in your database and sets their expired date accordingly.
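
As one way to fill in that placeholder, here is a minimal sketch using Python's built-in `sqlite3` module. The `jobs` table, its `expired_at` column, and the `expired` key assumed to hold the expiration date in each feed item are illustrative assumptions; check the endpoint documentation for the actual field name and adapt the schema to your own database.

```python
import sqlite3
from datetime import date


def mark_jobs_expired(conn, expired_items, date_field="expired"):
    """Set the expiration date on local rows whose IDs appear in the feed.

    date_field is an assumed key name for the expiration date in each item.
    Returns the number of local rows that were updated.
    """
    updated = 0
    with conn:  # Commits on success, rolls back on error
        for item in expired_items:
            # Fall back to today's date if the item carries no date
            expired_on = item.get(date_field) or date.today().isoformat()
            cur = conn.execute(
                "UPDATE jobs SET expired_at = ? "
                "WHERE id = ? AND expired_at IS NULL",
                (expired_on, item["id"]),
            )
            updated += cur.rowcount  # 0 if the job is not in the local DB
    return updated


# Demonstration with an in-memory database and made-up rows
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (id INTEGER PRIMARY KEY, title TEXT, expired_at TEXT)"
)
conn.executemany(
    "INSERT INTO jobs (id, title) VALUES (?, ?)",
    [(1, "Data Engineer"), (2, "Backend Developer")],
)

sample_feed = [{"id": 2, "expired": "2024-01-15"}, {"id": 99, "expired": "2024-01-16"}]
updated_count = mark_jobs_expired(conn, sample_feed)
print(f"Marked {updated_count} local job(s) as expired.")
```

Feed items whose IDs are not in the local table are simply skipped, which is expected because the expired feed is not limited to the jobs you imported.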

## Conclusion

By following this tutorial, you have learned how to fetch recent job listings from the jobdata API while ensuring that you keep your data fresh by checking for expired jobs daily. This approach allows you to maintain a clean and relevant job database, enhancing the user experience for job seekers.

### Important Notes

- Ensure that you do not make parallel requests to the API; instead, make requests sequentially to avoid hitting rate limits.
- Consider implementing error handling and logging for production-level applications to track issues and performance.
- Regularly review and optimize your fetching and updating logic to ensure efficiency and accuracy.
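
The error handling and logging mentioned above could look something like the sketch below: a small wrapper that retries a failing request sequentially with an increasing delay and logs each attempt. `call_with_retry`, its parameters, and the logger name are hypothetical; the flaky function stands in for a real request so the example runs without network access.

```python
import logging
import time

logger = logging.getLogger("jobdata_sync")


def call_with_retry(fetch, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch() and retry on any exception with exponential backoff.

    Retries happen one after another, never in parallel.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except Exception as exc:
            if attempt == max_attempts:
                logger.error("Giving up after %d attempts: %s", attempt, exc)
                raise
            delay = base_delay * 2 ** (attempt - 1)
            logger.warning(
                "Attempt %d failed (%s), retrying in %.1fs", attempt, exc, delay
            )
            sleep(delay)


# Demonstration: a stand-in request that fails twice, then succeeds
attempts = {"count": 0}


def flaky_fetch():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


delays = []  # Record the delays instead of actually sleeping
result = call_with_retry(flaky_fetch, sleep=delays.append)
print(result, delays)
```

With `requests`, the `fetch` callable could perform the `requests.get` call and then `response.raise_for_status()`, so that HTTP error responses trigger a retry as well.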

Feel free to modify the code snippets to fit your specific use case and database structure!
