Running the Sync Profile
With prerequisites met and the .env file configured, start the synchronous scraping profile using Docker Compose.
The synchronous profile will start the PostgreSQL database and the StreetEasy scraper application, configured to use BrightData’s Web Unlocker in a synchronous, multi-threaded manner.
Starting the Services
In the project root directory, run:
```shell
docker compose --profile sync up --build
```

- `docker compose`: The command to interact with Docker Compose.
- `--profile sync`: Tells Docker Compose to include only the services defined under the `sync` profile in your `docker-compose.yml`: `db`, `prestart`, and `sync_scraper`.
- `up`: Builds, creates, starts, and attaches to the containers for those services.
- `--build`: Ensures the Docker images are rebuilt before the containers start.
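The same standard Compose flags work with a profile if you prefer to run the stack in the background. For example (service names as above; these commands assume a Docker daemon is running):

```shell
# Start the sync profile in detached mode, rebuilding images first
docker compose --profile sync up --build -d

# Follow the scraper's logs while the stack runs detached
docker compose --profile sync logs -f sync_scraper

# Stop and remove the profile's containers when you are done
docker compose --profile sync down
```

Running detached keeps your terminal free; `logs -f` reattaches to any service's output whenever you need it.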
What to Expect
When you run the command, Docker Compose will:
1. Build the necessary Docker images for the services in the `sync` profile (`db`, `prestart`, and `sync_scraper`).
2. Create and start the `db` (PostgreSQL) container.
3. Wait for the `db` service to pass its health check, ensuring the database is ready.
4. Run the `prestart` service, which initializes the database tables using SQLAlchemy and runs the `address_scraper` script to seed the database with initial addresses.
5. Wait for the `prestart` service to complete successfully.
6. Start the `sync_scraper` container.
7. Attach to the logs of the running containers, so you see the output directly in your terminal.
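This ordering is driven by Docker Compose's dependency conditions. A minimal sketch of how such a `sync` profile could be wired (illustrative only; the project's actual `docker-compose.yml` may differ in image names and details):

```yaml
services:
  db:
    image: postgres:16
    profiles: ["sync"]
    healthcheck:                      # gate: "database is ready"
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      retries: 5

  prestart:
    build: .
    profiles: ["sync"]
    depends_on:
      db:
        condition: service_healthy    # wait for the health check to pass

  sync_scraper:
    build: .
    profiles: ["sync"]
    depends_on:
      prestart:
        condition: service_completed_successfully  # wait for prestart to exit 0
```

The `service_healthy` and `service_completed_successfully` conditions are what let Compose sequence a long-running database, a one-shot initializer, and the scraper without manual waiting.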
You should see output indicating the startup of `db`, the execution and completion of `prestart`, and finally the logs from `sync_scraper`, which will connect to the database and begin processing the addresses.
The `docker-compose.override.yml` file maps port 5432 on your host machine to the database container's port, so you can access the database locally.
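With that port mapping in place, you can inspect the database from your host while the stack is running. For example, using `psql` (the user and database name below are placeholders; use the credentials from your `.env` file):

```shell
# Connect from the host; you will be prompted for the password
psql -h localhost -p 5432 -U postgres -d streeteasy

# Or open a psql shell inside the running db container instead
docker compose exec db psql -U postgres
```

The second form avoids needing a local `psql` install, since it uses the client already bundled in the PostgreSQL image.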
Next Steps
Congratulations! You have successfully completed the “Getting Started” quickstart by running the synchronous scraping profile.
To further explore the capabilities of the `streeteasy-scraper` and learn about the more scalable asynchronous scraping workflow, consult the following sections of the documentation:
- System Design and Architecture: Understand how the different components of the project work together.
- Guides: Learn how to set up and run the asynchronous profile, provide different addresses for scraping, and access/report on the scraped data.
These sections will provide you with the knowledge to leverage the full power of the project.