Our scraping service Diggernaut lets you easily scrape products and prices from the chanel.com online store using the parser below. Chanel was founded in the early twentieth century by the designer Coco Chanel, and its first boutique opened in Paris in 1910. In 1924 the company launched perfume production. Today Chanel specializes in the production and sale of clothing, luxury goods, perfume, and cosmetics.
Approximate number of products: 1000
Approximate number of requests: 1000
Recommended subscription plan: Free
ATTENTION! The number of requests may exceed the number of products, because data about variations, images, and so on may be scraped with requests to additional resources. Part of the product data may also be delivered via XHR requests, which further increases the total number of requests required.
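As a rough illustration of why the request count can exceed the product count, the arithmetic can be sketched in Python. The function name `estimate_requests` and the numbers passed to it are hypothetical and are not taken from Diggernaut; the sketch simply assumes one request per product page, plus listing pages, plus a fixed number of extra requests per product.

```python
def estimate_requests(products: int, listing_pages: int = 0,
                      extra_per_product: int = 0) -> int:
    """Rough request count: one request per product page, plus listing
    pages, plus extra per-product requests (variations, images, XHR)."""
    if min(products, listing_pages, extra_per_product) < 0:
        raise ValueError("counts must be non-negative")
    return listing_pages + products * (1 + extra_per_product)


# Hypothetical numbers, for illustration only.
print(estimate_requests(products=1000))
print(estimate_requests(products=1000, listing_pages=40, extra_per_product=1))
```

With one extra request per product, the total roughly doubles, which is worth keeping in mind when choosing a subscription plan.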
How to use the Chanel online store parser
To use this parser, you need an account with our Diggernaut service.
- Follow this link to register with Diggernaut
- After registering and confirming your email address, log in to your account
- Create a project with any name and description; if you don't know how, refer to our documentation
- Open the newly created project and create a digger in it with any name; if you don't know how, refer to our documentation
- Copy the digger script below to the clipboard and paste it into the digger you created; if you don't know how, refer to our documentation
- Switch the digger's mode from Debug to Active; if you don't know how, refer to our documentation
- Run your digger and wait for it to finish; if you don't know how, refer to our documentation
- Download the collected dataset in the format you need; if you don't know how, refer to our documentation
Afterwards, you can set up a schedule to run your parser and collect the data regularly.
Parser script
---
config:
    debug: 2
    agent: Firefox
do:
- link_add:
    pool: beauty
    url:
    - https://www.chanel.com/en_US/fragrance-beauty/fragrance-beauty-skincare-140910
- walk:
    to: links
    pool: beauty
    do:
    - find:
        path: a.product-link
        do:
        - parse:
            attr: href
        - if:
            match: \w+
            do:
            - normalize:
                routine: url
            - link_add:
                pool: beauty
    - find:
        path: a:haschild(div.top_header[role="button"])
        do:
        - parse:
            attr: href
        - if:
            match: \w+
            do:
            - normalize:
                routine: url
            - link_add:
                pool: beauty
    - find:
        path: form.product_container ul.unstyled>li.img>a
        do:
        - parse:
            attr: href
        - if:
            match: \w+
            do:
            - normalize:
                routine: url
            - link_add:
                pool: beautypages
- walk:
    to: links
    pool: beautypages
    do:
    - sleep: 2
    - find:
        path: 'div#contentContainer'
        do:
        - object_new: product
        - eval:
            routine: js
            body: '(function (){var d = new Date(); return d.toISOString()})();'
        - object_field_set:
            object: product
            field: date
        - static_get: url
        - object_field_set:
            object: product
            field: url
        - find:
            path: h1[itemprop="name"]
            do:
            - parse
            - space_dedupe
            - trim
            - if:
                match: \w+
                do:
                - object_field_set:
                    object: product
                    field: name
        - find:
            path: div.cc-sku-selector-dropdown select>option
            slice: 0
            do:
            - parse:
                attr: value
            - space_dedupe
            - trim
            - if:
                match: \w+
                do:
                - object_field_set:
                    object: product
                    field: sku
        - register_set: Chanel
        - object_field_set:
            object: product
            field: brand
        - find:
            path: div.cc-product-options-price
            do:
            - find:
                path: span[itemprop="priceCurrency"]
                do:
                - parse:
                    attr: content
                - space_dedupe
                - trim
                - object_field_set:
                    object: product
                    field: currency
            - find:
                path: span[itemprop="price"]
                do:
                - parse:
                    filter: (\d+\.\d+)
                - space_dedupe
                - trim
                - object_field_set:
                    object: product
                    type: float
                    field: price
        - find:
            path: div.cc-sku-selector-dropdown select>option
            do:
            - parse
            - space_dedupe
            - trim
            - if:
                match: \w{2,}
                do:
                - object_field_set:
                    object: product
                    joinby: "|"
                    field: variations
        - find:
            in: doc
            path: script:contains('window.__CC_STATE__')
            do:
            - parse:
                filter: window\.__CC_STATE__\s*\=\s*(.+)\;
            - normalize:
                routine: json2xml
            - to_block
            - find:
                path: images src
                do:
                - parse
                - if:
                    match: \w+
                    do:
                    - normalize:
                        routine: url
                    - object_field_set:
                        object: product
                        joinby: "|"
                        field: images
            - find:
                path: product>description
                slice: 0
                do:
                - parse
                - to_block
                - find:
                    path: p
                    slice: 1
                    do:
                    - parse
                    - space_dedupe
                    - trim
                    - if:
                        match: \w+
                        do:
                        - object_field_set:
                            object: product
                            field: description
        - find:
            path: span.breadcrumb>a
            do:
            - parse
            - space_dedupe
            - trim
            - if:
                match: \w{2,}
                do:
                - object_field_set:
                    object: product
                    joinby: "|"
                    field: categories
        - object_save:
            name: product
- link_add:
    pool: sun
    url:
    - https://www.chanel.com/en_US/fashion/sunglasses/products/
- walk:
    to: links
    pool: sun
    do:
    - find:
        path: a.ui-pagination-next
        do:
        - parse:
            attr: href
        - if:
            match: \w+
            do:
            - normalize:
                routine: url
            - link_add:
                pool: sun
    - find:
        path: ul.product-list>li
        do:
        - find:
            path: li.item
            slice: 0
            do:
            - find:
                path: a
                do:
                - parse:
                    attr: href
                - if:
                    match: \w+
                    do:
                    - normalize:
                        routine: url
                    - link_add:
                        pool: sunpages
- walk:
    to: links
    pool: sunpages
    do:
    - sleep: 2
    - find:
        path: main[role="main"]
        do:
        - variable_clear: pid
        - object_new: product
        - eval:
            routine: js
            body: '(function (){var d = new Date(); return d.toISOString()})();'
        - object_field_set:
            object: product
            field: date
        - static_get: url
        - object_field_set:
            object: product
            field: url
        - find:
            path: h1.tt-1
            do:
            - parse
            - space_dedupe
            - trim
            - if:
                match: \w+
                do:
                - object_field_set:
                    object: product
                    field: name
        - find:
            path: input[name="pdt-sku"]
            do:
            - parse:
                attr: value
            - space_dedupe
            - trim
            - if:
                match: \w+
                do:
                - variable_set: pid
                - object_field_set:
                    object: product
                    field: sku
        - register_set: Chanel
        - object_field_set:
            object: product
            field: brand
        - find:
            path: span[property="price"]
            do:
            - parse:
                filter: (\d+)
            - object_field_set:
                object: product
                type: float
                field: price
            - parse
            - normalize:
                routine: replace_matched
                args:
                    \$: USD
            - object_field_set:
                object: product
                field: currency
        - find:
            path: select[data-select="pdt-color"]>option
            do:
            - parse
            - space_dedupe
            - trim
            - if:
                match: \w{2,}
                do:
                - object_field_set:
                    object: product
                    joinby: "|"
                    field: variations
        - find:
            in: doc
            path: meta[name="description"]
            do:
            - parse:
                attr: content
            - space_dedupe
            - trim
            - if:
                match: \w+
                do:
                - object_field_set:
                    object: product
                    field: description
        - find:
            path: div.breadcrumb>a
            slice: 1:-2
            do:
            - parse
            - space_dedupe
            - trim
            - if:
                match: \w{2,}
                do:
                - object_field_set:
                    object: product
                    joinby: "|"
                    field: categories
        - walk:
            to: https://www.chanel.com/en_US/fashion/sunglasses/pdpjson/<%pid%>/product
            do:
            - find:
                path: script
                do:
                - parse
                - normalize:
                    routine: replace_substring
                    args:
                        ^window\.: ''
                - to_block
                - parse
                - eval:
                    routine: js
                    body: (function () {var <%register%>; return JSON.stringify(product);})();
                - normalize:
                    routine: json2xml
                - to_block
                - find:
                    path: zoom
                    do:
                    - parse:
                        filter: ^(\S+)
                    - if:
                        match: \w+
                        do:
                        - normalize:
                            routine: url
                        - object_field_set:
                            object: product
                            joinby: "|"
                            field: images
        - object_save:
            name: product
- link_add:
    url:
    - https://www.chanel.com/en_US/watches-jewelry/fine-jewelry/collections
    - https://www.chanel.com/en_US/watches-jewelry/watches/collections
- walk:
    to: links
    do:
    - find:
        path: div.product-item-wrapper>a
        do:
        - parse:
            attr: href
        - register_set: <%register%>?show=All
        - walk:
            to: value
            do:
            - find:
                path: div.product-item-wrapper>a
                do:
                - parse:
                    attr: href
                - if:
                    match: \w+
                    do:
                    - normalize:
                        routine: url
                    - link_add:
                        pool: pages
- walk:
    to: links
    pool: pages
    do:
    - sleep: 2
    - find:
        path: 'main#page-content'
        do:
        - variable_clear: pid
        - object_new: product
        - eval:
            routine: js
            body: '(function (){var d = new Date(); return d.toISOString()})();'
        - object_field_set:
            object: product
            field: date
        - static_get: url
        - object_field_set:
            object: product
            field: url
        - find:
            path: dl>dt:contains("Name:")+dd
            do:
            - parse
            - space_dedupe
            - trim
            - if:
                match: \w+
                do:
                - object_field_set:
                    object: product
                    field: name
        - find:
            path: dl>dt:contains("Reference:")+dd
            do:
            - parse
            - space_dedupe
            - trim
            - if:
                match: \w+
                do:
                - variable_set: pid
                - object_field_set:
                    object: product
                    field: sku
        - register_set: Chanel
        - object_field_set:
            object: product
            field: brand
        - find:
            path: product_price
            do:
            - parse
            - object_field_set:
                object: product
                type: float
                field: price
            - register_set: USD
            - object_field_set:
                object: product
                field: currency
        - find:
            in: doc
            path: meta[name="description"]
            do:
            - parse:
                attr: content
            - space_dedupe
            - trim
            - if:
                match: \w+
                do:
                - object_field_set:
                    object: product
                    field: description
        - find:
            path: 'nav#breadcrumb>ul>li:not(.visually-hidden)>a'
            slice: 1:-1
            do:
            - parse
            - space_dedupe
            - trim
            - if:
                match: \w{2,}
                do:
                - object_field_set:
                    object: product
                    joinby: "|"
                    field: categories
        - find:
            path: div.product-images figure>a
            do:
            - parse:
                attr: href
            - space_dedupe
            - trim
            - if:
                match: \w{2,}
                do:
                - normalize:
                    routine: url
                - object_field_set:
                    object: product
                    joinby: "|"
                    field: images
        - object_save:
            name: product
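To make the two regular-expression filters in the fragrance and beauty branch easier to follow, here is a minimal Python sketch that applies the same patterns outside of Diggernaut. It is only an approximation: Python's `re` module stands in for the regex engine Diggernaut uses, the helper names `extract_state` and `extract_price` are hypothetical, and the sample script text is invented for illustration rather than copied from chanel.com.

```python
import json
import re

# Same pattern as the `filter` applied to the script holding window.__CC_STATE__.
STATE_RE = re.compile(r"window\.__CC_STATE__\s*\=\s*(.+)\;")
# Same pattern as the price `filter` in the fragrance and beauty branch.
PRICE_RE = re.compile(r"(\d+\.\d+)")


def extract_state(script_text: str):
    """Return the JSON object assigned to window.__CC_STATE__, or None."""
    match = STATE_RE.search(script_text)
    return json.loads(match.group(1)) if match else None


def extract_price(text: str):
    """Return the first decimal number in the text as a float, or None."""
    match = PRICE_RE.search(text)
    return float(match.group(1)) if match else None


# Hypothetical inputs, for illustration only.
sample_script = 'window.__CC_STATE__ = {"product": {"images": [{"src": "/img/a.jpg"}]}};'
state = extract_state(sample_script)
print(state["product"]["images"][0]["src"])
print(extract_price("$130.00"))
print(extract_price("$130"))
```

Note that the decimal pattern requires a fractional part, so in this sketch a price written without decimals yields no match.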
Sample dataset collected by the parser from the Chanel website
Below is a sample dataset with several products, shown in JSON format for clarity. The dataset can also be downloaded as CSV, XLSX, XML, or any other text format by using templates.
[{
"product": {
"brand": "Chanel",
"categories": "Fragrance|Women|Allure Sensuelle",
"currency": "USD",
"date": "2017-12-27T12:44:54.948Z",
"description": "Like the charismatic, passionate presence of Gabrielle Chanel, ALLURE SENSUELLE is the modern, magnetic fragrance of a true, radiant and intense woman. The floral-soft-Oriental fragrance is revealed in a unique way on every woman — because each woman has her own special allure.",
"images": "https://www.chanel.com/en_US/fragrance-beauty/cms2export/Site1Files/P129710/S129710_XLARGE.jpg|https://www.chanel.com/en_US/fragrance-beauty/cms2export/Site1Files/P129710/S129720_XLARGE.jpg|https://www.chanel.com/en_US/fragrance-beauty/cms2export/Site1Files/P129710/S129730_XLARGE.jpg",
"name": "ALLURE SENSUELLE EAU DE PARFUM SPRAY",
"price": 130,
"sku": "88316",
"url": "https://www.chanel.com/en_US/fragrance-beauty/fragrance-allure-sensuelle-allure-sensuelle-88314",
"variations": "3.4 FL. OZ.|1.7 FL. OZ.|1.2 FL. OZ."
}
}
,{
"product": {
"brand": "Chanel",
"categories": "Makeup|Lips|Lipstick",
"currency": "USD",
"date": "2017-12-27T12:45:00.308Z",
"description": "The intensity of a lipstick, the shine of a lipgloss and the comfort of a lip balm — all in one creamy yet lightweight formula.",
"images": "https://www.chanel.com/en_US/fragrance-beauty/cms2export/Site1Files/P170202/S170202_XLARGE.jpg|https://www.chanel.com/en_US/fragrance-beauty/cms2export/Site1Files/P170202/S170206_XLARGE.jpg|https://www.chanel.com/en_US/fragrance-beauty/cms2export/Site1Files/P170202/S170208_XLARGE.jpg|https://www.chanel.com/en_US/fragrance-beauty/cms2export/Site1Files/P170202/S170212_XLARGE.jpg|https://www.chanel.com/en_US/fragrance-beauty/cms2export/Site1Files/P170202/S170214_XLARGE.jpg|https://www.chanel.com/en_US/fragrance-beauty/cms2export/Site1Files/P170202/S170216_XLARGE.jpg|https://www.chanel.com/en_US/fragrance-beauty/cms2export/Site1Files/P170202/S170218_XLARGE.jpg|https://www.chanel.com/en_US/fragrance-beauty/cms2export/Site1Files/P170202/S170217_XLARGE.jpg|https://www.chanel.com/en_US/fragrance-beauty/cms2export/Site1Files/P170202/S170222_XLARGE.jpg|https://www.chanel.com/en_US/fragrance-beauty/cms2export/Site1Files/P170202/S170224_XLARGE.jpg|https://www.chanel.com/en_US/fragrance-beauty/cms2export/Site1Files/P170202/S170227_XLARGE.jpg",
"name": "ROUGE COCO STYLO COMPLETE CARE LIPSHINE",
"price": 37,
"sku": "141754",
"url": "https://www.chanel.com/en_US/fragrance-beauty/makeup-lipstick-rouge-coco-stylo-140392",
"variations": "217 PANORAMA - Limited Edition|218 SCRIPT|216 LETTRE|202 CONTE|227 ESQUISSE - Limited Edition|222 FICTION|206 HISTOIRE|208 ROMAN|214 MESSAGE|224 MÉMOIRE|212 RECIT"
}
}
,{
"product": {
"brand": "Chanel",
"categories": "Skincare|BY CATEGORY|Sun Protection",
"currency": "USD",
"date": "2017-12-27T12:45:04.800Z",
"description": "A breakthrough daily sunscreen that features an adaptive skincare technology for tailor-made defense from UVA and UVB rays, free radicals and pollution.",
"images": "https://www.chanel.com/en_US/fragrance-beauty/cms2export/Site1Files/P141836/S141836_XLARGE.jpg",
"name": "UV ESSENTIEL Multi-Protection Daily Defense Sunscreen Anti-Pollution Broad Spectrum SPF 30",
"price": 55,
"sku": "140249",
"url": "https://www.chanel.com/en_US/fragrance-beauty/skincare-sun-protection-uv-essentiel-140248",
"variations": "1 FL. OZ."
}
}
,{
"product": {
"brand": "Chanel",
"categories": "Makeup|Eyes|Mascara",
"currency": "USD",
"date": "2017-12-27T12:45:09.444Z",
"description": "A high-precision waterproof mascara that achieves instant volume and intense colour in a single stroke.",
"images": "https://www.chanel.com/en_US/fragrance-beauty/cms2export/Site1Files/P194210/S194210_XLARGE.jpg|https://www.chanel.com/en_US/fragrance-beauty/cms2export/Site1Files/P194210/S194220_XLARGE.jpg",
"name": "LE VOLUME DE CHANEL WATERPROOF MASCARA",
"price": 32,
"sku": "139065",
"url": "https://www.chanel.com/en_US/fragrance-beauty/makeup-mascara-le-volume-de-chanel-waterproof-139064",
"variations": "10 NOIR|20 BRUN"
}
}]
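The `categories`, `images`, and `variations` fields in the sample above are single strings joined with `|` (the script sets `joinby: "|"` for them). A minimal Python sketch for turning them back into lists after download could look like this; the helper name `split_pipe_fields` is hypothetical and the record is an abbreviated copy of the last product in the sample.

```python
import json

# Fields that the parser script joins with "|".
PIPE_FIELDS = ("categories", "images", "variations")


def split_pipe_fields(record: dict) -> dict:
    """Return a copy of record["product"] with pipe-joined fields as lists."""
    product = dict(record["product"])
    for field in PIPE_FIELDS:
        value = product.get(field)
        product[field] = value.split("|") if value else []
    return product


# Abbreviated record from the sample dataset (the images field is omitted here).
raw = """[{"product": {"brand": "Chanel",
  "categories": "Makeup|Eyes|Mascara",
  "name": "LE VOLUME DE CHANEL WATERPROOF MASCARA",
  "price": 32,
  "variations": "10 NOIR|20 BRUN"}}]"""

products = [split_pipe_fields(record) for record in json.loads(raw)]
print(products[0]["categories"])
print(products[0]["variations"])
print(products[0]["images"])
```

This assumes that no individual value contains a `|` character itself; if one did, it would be split into two items.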