Running the Sync Profile
With prerequisites met and the .env
file configured, start the synchronous scraping profile using Docker Compose.
The synchronous profile will start the PostgreSQL database and the StreetEasy scraper application, configured to use BrightData’s Web Unlocker in a synchronous, multi-threaded manner.
Starting the Services
Section titled “Starting the Services”In the project root directory, run:
docker compose --profile sync up --build
docker compose
: The command to interact with Docker Compose.--profile sync
: This flag tells Docker Compose to only include the services defined under thesync
profile in yourdocker-compose.yml
. This will include thedb
,prestart
, andsync_scraper
services.up
: This command builds, creates, starts, and attaches to containers for a service.--build
: This flag ensures that the Docker images are built before starting the containers.
What to Expect
Section titled “What to Expect”When you run the command, Docker Compose will:
- Build the necessary Docker images for the services in the
sync
profile (db
,prestart
, andsync_scraper
). - Create and start the
db
(PostgreSQL) container. - Wait for the
db
service to pass its health check, ensuring the database is ready. - Run the
prestart
service. This service initializes the database tables using SQLAlchemy and runs theaddress_scraper
script to seed the database with initial addresses. - Wait for the
prestart
service to complete successfully. - Start the
sync_scraper
container. - Attach to the logs of the running containers, so you will see the output directly in your terminal.
You should see output indicating the startup of the db
, the execution and completion of prestart
, and finally the logs from sync_scraper
. The sync_scraper
will then connect to the database and begin processing the addresses.
The docker-compose.override.yml
ensures that the database port 5432
on your host machine is mapped to the database container’s port for you to access locally.
Next Steps
Section titled “Next Steps”Congratulations! You have successfully completed the “Getting Started” quickstart by running the synchronous scraping profile.
To further explore the capabilities of the streeteasy-scraper
and learn about the more scalable asynchronous scraping workflow, consult the following sections of the documentation:
- System Design and Architecture: Understand how the different components of the project work together.
- Guides: Learn how to set up and run the asynchronous profile, provide different addresses for scraping, and access/report on the scraped data.
These sections will provide you with the knowledge to leverage the full power of the project.