# Initialization and setup


> **Archived course**
>
> This is an archive of our old course. Check out our new [Web scraping basics for JavaScript devs](https://pr-2338.preview.docs.apify.com/academy/scraping-basics-javascript.md) course instead! We plan to retire this old course completely in a few months.

***

The Crawlee CLI speeds up the process of setting up a Crawlee project. Navigate to the directory where you'd like your project's folder to live, open a terminal, and run the following command:


```shell
npx crawlee create amazon-crawler
```


Once you run this command, you'll be presented with an interactive menu that you can navigate with your arrow keys. Each option generates different boilerplate code when selected. We're going to work with CheerioCrawler today, so select the **CheerioCrawler template project** template and press **Enter**.

![Crawlee CLI "create" command](/assets/images/crawlee-create-2f29a900f2ab57f734d05c25cc37feed.png)

Once it's completed, open up the **amazon-crawler** folder that was generated by the `npx crawlee create` command. We're going to modify the **main.js** boilerplate to fit our needs:


```js
// main.js
import { CheerioCrawler, KeyValueStore, log } from 'crawlee';
import { router } from './routes.js';

// Grab our keyword from the input
const { keyword } = await KeyValueStore.getInput();

const crawler = new CheerioCrawler({
    requestHandler: router,

    // If you have access to Apify Proxy, you can use residential proxies and a
    // high retry count, which helps with blocking. (Enabling the line below also
    // requires importing Actor from the 'apify' package.)
    // If you don't, your local IP address will likely be fine for a few requests if you scrape slowly.
    // proxyConfiguration: await Actor.createProxyConfiguration({ groups: ['RESIDENTIAL'] }),
    // maxRequestRetries: 10,
});

log.info('Starting the crawl.');
await crawler.run([{
    // Turn the keyword into a link we can make a request with
    url: `https://www.amazon.com/s/ref=nb_sb_noss?url=search-alias%3Daps&field-keywords=${keyword}`,
    label: 'START',
    userData: {
        keyword,
    },
}]);
log.info('Crawl finished.');
```
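The start request above interpolates `keyword` directly into the query string, which works for a single-word keyword like `iphone` but breaks for keywords containing spaces or special characters. As a sketch (plain Node.js, no Crawlee needed), the same URL can be built safely with `URLSearchParams`; the `buildSearchUrl` helper is our own illustration, not part of the template:

```javascript
// Hypothetical helper: build the Amazon search URL for a keyword,
// URL-encoding it so spaces and special characters don't break the request.
// The base URL and parameter names come from the START request above.
function buildSearchUrl(keyword) {
    const params = new URLSearchParams({
        'url': 'search-alias=aps',
        'field-keywords': keyword,
    });
    return `https://www.amazon.com/s/ref=nb_sb_noss?${params.toString()}`;
}

console.log(buildSearchUrl('iphone 13 case'));
// Spaces in the keyword are encoded as "+" rather than breaking the URL.
```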



```js
// routes.js
import { createCheerioRouter } from 'crawlee';

export const router = createCheerioRouter();

router.addDefaultHandler(({ log }) => {
    log.info('Route reached.');
});
```
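The default handler above runs for any request whose `label` doesn't have a dedicated handler, which is why our `START`-labeled request currently lands there. Conceptually, the router dispatches each request based on its `label`. Here's a toy illustration of that dispatch idea in plain JavaScript (our own simplification, not Crawlee's actual implementation):

```javascript
// Toy router: maps labels to handlers and falls back to a default handler,
// mirroring how a label-based router dispatches requests.
function createToyRouter() {
    const handlers = new Map();
    let defaultHandler = null;
    const router = (request) => {
        const handler = handlers.get(request.label) ?? defaultHandler;
        if (!handler) throw new Error(`No handler for label: ${request.label}`);
        return handler(request);
    };
    router.addHandler = (label, fn) => handlers.set(label, fn);
    router.addDefaultHandler = (fn) => { defaultHandler = fn; };
    return router;
}

const router = createToyRouter();
router.addDefaultHandler(() => 'default route');
router.addHandler('START', () => 'start route');

console.log(router({ label: 'START' })); // "start route"
console.log(router({ label: 'OTHER' })); // "default route"
```

In the real project, we'll register label-specific handlers on the Crawlee router as we build out the scraper.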


Finally, we'll create an **INPUT.json** file in the project's root directory (next to `package.json`, `node_modules`, and the rest) with the following contents:


```json
{
  "keyword": "iphone"
}
```


> This is how we'll be inputting data into our scraper from now on. Don't worry, though: we'll only need to work in the **main.js** and **routes.js** files!
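Since a missing or empty `keyword` would produce a useless start URL, it can be worth validating the input before starting the crawl. A minimal sketch in plain JavaScript (the `validateInput` helper is hypothetical, not part of the generated template):

```javascript
// Hypothetical helper: check that the parsed INPUT.json contains a usable keyword.
function validateInput(input) {
    if (!input || typeof input.keyword !== 'string' || input.keyword.trim() === '') {
        throw new Error('INPUT.json must contain a non-empty "keyword" string.');
    }
    return input.keyword.trim();
}

const keyword = validateInput({ keyword: 'iphone' });
console.log(keyword); // "iphone"
```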

## Next up

Cool! We're ready. But [let's discuss modularity a bit](https://pr-2338.preview.docs.apify.com/academy/scraping-basics-javascript/legacy/challenge/modularity.md) before moving forward!
