Anthropologie is an American clothing retailer. The company currently operates more than 200 stores worldwide and offers a carefully curated assortment of clothing, jewelry, lingerie, home goods and decor, beauty products, and gifts. In August 1992, Richard Hayne came up with the idea of opening a clothing store for creative, educated women aged 30 to 45, and that is how Anthropologie began. This online store scraper is designed to collect information about the products listed on the store's website, anthropologie.com.
Approximate number of products: 50000
Approximate number of requests: 50000
Recommended subscription plan: Small
NOTE: The number of requests may exceed the number of products, because data on variations, images, and so on may be collected through requests to additional resources. Part of the product data may also be delivered via XHR requests, which further increases the total number of requests required.
To use it, you need an account with our Diggernaut service.
- Follow this link to register with the Diggernaut service
- After registering and confirming your email address, log in to your account
- Create a project with any name and description; if you don't know how, refer to our documentation
- Open the newly created project and create a digger in it with any name; if you don't know how, refer to our documentation
- Copy the digger scenario below to the clipboard and paste it into the digger you created; if you don't know how, refer to our documentation
- Switch the digger's mode from Debug to Active; if you don't know how, refer to our documentation
- Run your digger and wait for it to finish; if you don't know how, refer to our documentation
- Download the collected dataset in the format you need; if you don't know how, refer to our documentation
Afterwards, you can set up a schedule for your scraper and collect the data regularly.
Scraper scenario:
---
config:
    debug: 2
    agent: Firefox
do:
# Start at the home page and walk the main navigation menu
- walk:
    to: https://www.anthropologie.com
    do:
    - find:
        path: .c-main-navigation__li--level-1
        do:
        # Level-1 category name
        - find:
            path: span
            slice: 0
            do:
            - parse
            - space_dedupe
            - trim
            - normalize:
                routine: lower
            - variable_set: cat1
        - find:
            path: .c-main-navigation__li--level-2
            do:
            - variable_clear: subcat
            # Level-2 category name
            - find:
                path: .c-main-navigation__a--level-2
                do:
                - parse
                - space_dedupe
                - trim
                - normalize:
                    routine: lower
                - variable_set: cat2
            # Level-3 categories: crawl each listing
            - find:
                path: .c-main-navigation__li--level-3 a
                do:
                - parse
                - space_dedupe
                - trim
                - normalize:
                    routine: lower
                - variable_set: cat3
                - variable_set:
                    field: subcat
                    value: 1
                - parse:
                    attr: href
                - pool_clear: main
                - link_add:
                    pool: main
                - walk:
                    to: links
                    pool: main
                    do:
                    # Queue the next listing page
                    - find:
                        path: .js-pagination__arrow--next
                        slice: 0
                        do:
                        - parse:
                            attr: href
                        - link_add:
                            pool: main
                    # Visit each product page
                    - find:
                        path: .c-product-tile__image-link
                        do:
                        - parse:
                            attr: href
                            filter:
                            - (.+)\?
                            - (.+)
                        - normalize:
                            routine: url
                        - walk:
                            to: value
                            do:
                            - find:
                                path: body
                                do:
                                - object_new: product
                                - eval:
                                    routine: js
                                    body: '(function (){var d = new Date(); return d.toISOString()})();'
                                - object_field_set:
                                    object: product
                                    field: date
                                - register_set: Anthropologie
                                - object_field_set:
                                    object: product
                                    field: brand
                                - static_get: url
                                - object_field_set:
                                    object: product
                                    field: url
                                - find:
                                    path: meta[> img.c-product-image
                                    do:
                                    - parse:
                                        attr: src
                                        filter:
                                        - (.+)\?
                                        - (.+)
                                    - normalize:
                                        routine: url
                                    - object_field_set:
                                        object: product
                                        field: images
                                        joinby: "|"
                                # Decode the embedded product data and convert it to XML
                                - find:
                                    path: script:matches(window\.productData)
                                    do:
                                    - parse:
                                        filter:
                                        - window.productData\s*=\s*\'\s*(.+)\s*\'\s*;
                                    - normalize:
                                        routine: Base64ZLIBDecode
                                    - normalize:
                                        routine: json2xml
                                    - to_block
                                    - find:
                                        path: body_safe
                                        do:
                                        - find:
                                            path: primaryslice:hasChild(displaylabel:matches(Color))
                                            do:
                                            - find:
                                                path: sliceitems > displayname
                                                do:
                                                - parse
                                                - space_dedupe
                                                - trim
                                                - object_field_set:
                                                    object: product
                                                    field: variations
                                                    joinby: "|"
                                            - find:
                                                path: sliceitems
                                                do:
                                                - variable_clear: iid
                                                - find:
                                                    path: id
                                                    slice: 0
                                                    do:
                                                    - parse
                                                    - variable_set: iid
                                                - find:
                                                    path: images
                                                    do:
                                                    - parse
                                                    - register_set: http://images.anthropologie.com/is/image/Anthropologie/_
                                                    - object_field_set:
                                                        object: product
                                                        field: images
                                                        joinby: "|"
                                        - find:
                                            path: product > stylenumber
                                            slice: 0
                                            do:
                                            - parse
                                            - space_dedupe
                                            - trim
                                            - object_field_set:
                                                object: product
                                                field: sku
                                        - find:
                                            path: product > product > brand
                                            do:
                                            - parse
                                            - space_dedupe
                                            - trim
                                            - object_field_set:
                                                object: product
                                                field: brand
                                        - find:
                                            path: product > product > displayname
                                            do:
                                            - parse
                                            - space_dedupe
                                            - trim
                                            - object_field_set:
                                                object: product
                                                field: name
                                        - find:
                                            path: product > product > longdescription
                                            do:
                                            - parse
                                            - space_dedupe
                                            - trim
                                            - object_field_set:
                                                object: product
                                                field: description
                                # Append the non-empty category levels
                                - variable_get: cat1
                                - if:
                                    match: (\S)
                                    do:
                                    - object_field_set:
                                        object: product
                                        field: category
                                        joinby: "|"
                                - variable_get: cat2
                                - if:
                                    match: (\S)
                                    do:
                                    - object_field_set:
                                        object: product
                                        field: category
                                        joinby: "|"
                                - variable_get: cat3
                                - if:
                                    match: (\S)
                                    do:
                                    - object_field_set:
                                        object: product
                                        field: category
                                        joinby: "|"
                                - object_save:
                                    name: product
            # No level-3 categories were found: crawl the level-2 listing instead
            - variable_get: subcat
            - if:
                match: (1)
                else:
                - find:
                    path: .c-main-navigation__a--level-2
                    do:
                    - parse:
                        attr: href
                    - pool_clear: main
                    - link_add:
                        pool: main
                    - walk:
                        to: links
                        pool: main
                        do:
                        - find:
                            path: .js-pagination__arrow--next
                            slice: 0
                            do:
                            - parse:
                                attr: href
                            - link_add:
                                pool: main
                        - find:
                            path: .c-product-tile__image-link
                            do:
                            - parse:
                                attr: href
                                filter:
                                - (.+)\?
                                - (.+)
                            - normalize:
                                routine: url
                            - walk:
                                to: value
                                do:
                                - find:
                                    path: body
                                    do:
                                    - object_new: product
                                    - eval:
                                        routine: js
                                        body: '(function (){var d = new Date(); return d.toISOString()})();'
                                    - object_field_set:
                                        object: product
                                        field: date
                                    - register_set: Anthropologie
                                    - object_field_set:
                                        object: product
                                        field: brand
                                    - static_get: url
                                    - object_field_set:
                                        object: product
                                        field: url
                                    - find:
                                        path: meta[> img.c-product-image
                                        do:
                                        - parse:
                                            attr: src
                                            filter:
                                            - (.+)\?
                                            - (.+)
                                        - normalize:
                                            routine: url
                                        - object_field_set:
                                            object: product
                                            field: images
                                            joinby: "|"
                                    - find:
                                        path: script:matches(window\.productData)
                                        do:
                                        - parse:
                                            filter:
                                            - window.productData\s*=\s*\'\s*(.+)\s*\'\s*;
                                        - normalize:
                                            routine: Base64ZLIBDecode
                                        - normalize:
                                            routine: json2xml
                                        - to_block
                                        - find:
                                            path: body_safe
                                            do:
                                            - find:
                                                path: primaryslice:hasChild(displaylabel:matches(Color))
                                                do:
                                                - find:
                                                    path: sliceitems > displayname
                                                    do:
                                                    - parse
                                                    - space_dedupe
                                                    - trim
                                                    - object_field_set:
                                                        object: product
                                                        field: variations
                                                        joinby: "|"
                                                - find:
                                                    path: sliceitems
                                                    do:
                                                    - variable_clear: iid
                                                    - find:
                                                        path: id
                                                        slice: 0
                                                        do:
                                                        - parse
                                                        - variable_set: iid
                                                    - find:
                                                        path: images
                                                        do:
                                                        - parse
                                                        - register_set: http://images.anthropologie.com/is/image/Anthropologie/_
                                                        - object_field_set:
                                                            object: product
                                                            field: images
                                                            joinby: "|"
                                            - find:
                                                path: product > stylenumber
                                                slice: 0
                                                do:
                                                - parse
                                                - space_dedupe
                                                - trim
                                                - object_field_set:
                                                    object: product
                                                    field: sku
                                            - find:
                                                path: product > product > brand
                                                do:
                                                - parse
                                                - space_dedupe
                                                - trim
                                                - object_field_set:
                                                    object: product
                                                    field: brand
                                            - find:
                                                path: product > product > displayname
                                                do:
                                                - parse
                                                - space_dedupe
                                                - trim
                                                - object_field_set:
                                                    object: product
                                                    field: name
                                            - find:
                                                path: product > product > longdescription
                                                do:
                                                - parse
                                                - space_dedupe
                                                - trim
                                                - object_field_set:
                                                    object: product
                                                    field: description
                                    - variable_get: cat1
                                    - if:
                                        match: (\S)
                                        do:
                                        - object_field_set:
                                            object: product
                                            field: category
                                            joinby: "|"
                                    - variable_get: cat2
                                    - if:
                                        match: (\S)
                                        do:
                                        - object_field_set:
                                            object: product
                                            field: category
                                            joinby: "|"
                                    - variable_get: cat3
                                    - if:
                                        match: (\S)
                                        do:
                                        - object_field_set:
                                            object: product
                                            field: category
                                            joinby: "|"
                                    - object_save:
                                        name: product
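To show what two of the scenario's transformations do outside Diggernaut, here is a minimal Python sketch. It rests on two assumptions that the scenario itself does not confirm: that a `filter` list is tried in order and yields the first capture group of the first pattern that matches, and that `Base64ZLIBDecode` means Base64 decoding followed by zlib decompression. The function names `apply_filters` and `decode_product_data` are hypothetical and are not part of Diggernaut.

```python
import base64
import json
import re
import zlib


def apply_filters(value, patterns):
    """Return the first capture group of the first pattern that matches.

    Assumption: this mirrors how a `filter` list behaves in the scenario.
    """
    for pattern in patterns:
        match = re.search(pattern, value)
        if match:
            return match.group(1)
    return value


def decode_product_data(page_html):
    """Extract and decode the window.productData payload from page HTML.

    Returns the decoded JSON object, or None if the script is not present.
    """
    # Same pattern as the `parse` filter in the scenario.
    match = re.search(r"window.productData\s*=\s*'\s*(.+)\s*'\s*;", page_html)
    if match is None:
        return None
    # Assumption: the payload is Base64 text wrapping a zlib stream.
    raw = zlib.decompress(base64.b64decode(match.group(1)))
    return json.loads(raw.decode("utf-8"))
```

With the two patterns used in the scenario, `apply_filters(url, [r"(.+)\?", r"(.+)"])` would strip the query string from a URL that has one and return other URLs unchanged. If the real payload turns out to be raw deflate or gzip rather than a zlib stream, the `zlib.decompress` call would need a different `wbits` argument.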
Below is a sample dataset with several products in JSON format (for illustration). The dataset can also be downloaded as CSV, XLSX, XML, or any other text format using templates.
[{
"product": {
"brand": "Illume",
"category": "gifts|features|the gift guide",
"date": "2017-12-05T21:15:58.241Z",
"description": "New from the fragrance masters at Illume, Anatomy of a Fragrance bath and beauty products are sophisticated, lighthearted luxuries. Each is crafted in Minnesota, where Illume combines their signature scents with beautiful packaging designed in-house. From lavish hand creams to triple-milled soaps to nature-inspired perfumes, their line is ready-made for gifting and indulging. **Honey Rose**: a warm, romantic scent with notes of lily of the valley, sandalwood and bergamot **Orchid Vanille**: a bright, fresh combination of orange blossom, jasmine, black currant and praline **Wildflower Bergamot**: A zesty blend of bergamot, lemon and mango layered with cedar and sandalwood",
"images": "https://images.anthropologie.com/is/image/Anthropologie/44448363_040_b|http://images.anthropologie.com/is/image/Anthropologie/44448363_040_b|http://images.anthropologie.com/is/image/Anthropologie/44448363_070_b|http://images.anthropologie.com/is/image/Anthropologie/44448363_065_b",
"name": "Anatomy of a Fragrance Gift Set",
"sku": "44448363",
"url": "https://www.anthropologie.com/shop/anatomy-of-a-fragrance-gift-set",
"variations": "Wildflower Bergamot|Orchid Vanille|Honey Rose"
}
}
,{
"product": {
"brand": "Capri Blue",
"category": "gifts|features|the gift guide",
"date": "2017-12-05T21:15:59.713Z",
"description": "Capri Blue's iconic vessels and fragrances - proudly designed and poured in Mississippi - are a long-standing favorite at Anthropologie. The line pairs striking visuals with intoxicating scents to create beautifully aromatic products like soy-blended candles and vegan-formulated beauty care. **Volcano**: tropical fruits, sugared oranges, lemons and limes, redolent with lightly exotic mountain greens **Coastal**: notes of pineapple, verbena and coconut, accented by sparkling lemon, bergamot and grapefruit **Fir & Firewood**: a fruity, green aroma of apple, clove, fir, pine needle, white birch, cedar, vetiver and musk **Japanese Quince & Cedar**: aromatic cedar wood is embellished with sun-ripened cassis, sugared quince, accents of red currant and a splash of sparkling pomelo **Gardenia & Fig**: bright greens and fresh peach mingle with gardenia, rose, ylang ylang and coconut over a base of light musk **Cinnamon Toddy**: a mouthwatering medley of ripe apple, warm cinnamon, golden clove and grated nutmeg topped with notes of honey and maple **Spiced Cider**: nutmeg, clove and cinnamon are layered over fresh apple and juicy orange notes **Lagoon**: top notes of freesia, incense and tamarind blend over a musky base of cashmere, wood and vetiver **Grapefruit Neroli**: sun-kissed grapefruit, quince and tangerine over neroli, vanilla, orchid and currant",
"images": "https://images.anthropologie.com/is/image/Anthropologie/19851559_033_b|https://images.anthropologie.com/is/image/Anthropologie/19851559_033_b10|http://images.anthropologie.com/is/image/Anthropologie/19851559_033_b|http://images.anthropologie.com/is/image/Anthropologie/19851559_033_b10|http://images.anthropologie.com/is/image/Anthropologie/19851559_090_b|http://images.anthropologie.com/is/image/Anthropologie/19851559_090_b10|http://images.anthropologie.com/is/image/Anthropologie/19851559_090_b15|http://images.anthropologie.com/is/image/Anthropologie/19851559_090_b16|http://images.anthropologie.com/is/image/Anthropologie/19851559_049_b|http://images.anthropologie.com/is/image/Anthropologie/19851559_026_b|http://images.anthropologie.com/is/image/Anthropologie/19851559_098_b|http://images.anthropologie.com/is/image/Anthropologie/19851559_040_b|http://images.anthropologie.com/is/image/Anthropologie/19851559_007_b|http://images.anthropologie.com/is/image/Anthropologie/19851559_007_b2",
"name": "Capri Blue Iridescent Jar Candle",
"sku": "19851559",
"url": "https://www.anthropologie.com/shop/capri-blue-iridescent-jar-candle8",
"variations": "Fir and Firewood|Spiced Cider|Volcano|Spiced Cider|Fir and Firewood|Volcano|Volcano"
}
}
,{
"product": {
"brand": "Anthropologie",
"category": "gifts|features|the gift guide",
"date": "2017-12-05T21:16:00.340Z",
"images": "https://images.anthropologie.com/is/image/Anthropologie/39336862_001_b3|https://images.anthropologie.com/is/image/Anthropologie/39336862_001_b|https://images.anthropologie.com/is/image/Anthropologie/39336862_001_b2|https://images.anthropologie.com/is/image/Anthropologie/39336862_001_b14|http://images.anthropologie.com/is/image/Anthropologie/39336862_001_b3|http://images.anthropologie.com/is/image/Anthropologie/39336862_001_b|http://images.anthropologie.com/is/image/Anthropologie/39336862_001_b2|http://images.anthropologie.com/is/image/Anthropologie/39336862_001_b14|http://images.anthropologie.com/is/image/Anthropologie/39336862_074_b|http://images.anthropologie.com/is/image/Anthropologie/39336862_074_b2|http://images.anthropologie.com/is/image/Anthropologie/39336862_074_b3|http://images.anthropologie.com/is/image/Anthropologie/39336862_074_b14|http://images.anthropologie.com/is/image/Anthropologie/39336862_010_b|http://images.anthropologie.com/is/image/Anthropologie/39336862_010_b2|http://images.anthropologie.com/is/image/Anthropologie/39336862_010_b15|http://images.anthropologie.com/is/image/Anthropologie/39336862_030_b|http://images.anthropologie.com/is/image/Anthropologie/39336862_030_b2|http://images.anthropologie.com/is/image/Anthropologie/39336862_030_b15|http://images.anthropologie.com/is/image/Anthropologie/39336862_040_b|http://images.anthropologie.com/is/image/Anthropologie/39336862_040_b2|http://images.anthropologie.com/is/image/Anthropologie/39336862_040_b3|http://images.anthropologie.com/is/image/Anthropologie/39336862_040_b14|http://images.anthropologie.com/is/image/Anthropologie/39336862_065_b|http://images.anthropologie.com/is/image/Anthropologie/39336862_065_b2|http://images.anthropologie.com/is/image/Anthropologie/39336862_065_b3|http://images.anthropologie.com/is/image/Anthropologie/39336862_051_b|http://images.anthropologie.com/is/image/Anthropologie/39336862_051_b2|http://images.anthropologie.com/is/image/Anthropologie/39336862_051_b10|http://images.anthropologie.com/is/image/Anthropologie/39336862_066_b|http://images.anthropologie.com/is/image/Anthropologie/39336862_066_b2|http://images.anthropologie.com/is/image/Anthropologie/39336862_066_b10",
"name": "Slivered Geode Coaster",
"sku": "39336862",
"url": "https://www.anthropologie.com/shop/geode-coaster",
"variations": "Black Quartz|Dyed Citron|White Quartz|Adventurian|Dyed Blue|Dyed Magenta|Amethyst|Rose quartz"
}
}
,{
"product": {
"brand": "Floreat",
"category": "gifts|features|the gift guide",
"date": "2017-12-05T21:16:01.211Z",
"images": "https://images.anthropologie.com/is/image/Anthropologie/43663541_000_b|https://images.anthropologie.com/is/image/Anthropologie/43663541_000_b2|https://images.anthropologie.com/is/image/Anthropologie/43663541_000_b3|https://images.anthropologie.com/is/image/Anthropologie/43663541_000_b4|http://images.anthropologie.com/is/image/Anthropologie/43663541_000_b|http://images.anthropologie.com/is/image/Anthropologie/43663541_000_b2|http://images.anthropologie.com/is/image/Anthropologie/43663541_000_b3|http://images.anthropologie.com/is/image/Anthropologie/43663541_000_b4|http://images.anthropologie.com/is/image/Anthropologie/43663541_049_b|http://images.anthropologie.com/is/image/Anthropologie/43663541_049_b2|http://images.anthropologie.com/is/image/Anthropologie/43663541_049_b3|http://images.anthropologie.com/is/image/Anthropologie/43663541_049_b4",
"name": "Floreat Printed Sleep Pants",
"sku": "43663541",
"url": "https://www.anthropologie.com/shop/floreat-printed-sleep-pants",
"variations": "ASSORTED|BLUE MOTIF"
}
}]
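In the sample, list-like fields are joined with "|" and can contain repeats: the same image appears under both http and https, and some variations are listed more than once. A minimal Python sketch for post-processing a downloaded JSON dataset might look like this, assuming the record layout of the sample; the helper names `split_field`, `dedupe_images`, and `normalize_record` are hypothetical.

```python
def split_field(value):
    """Split a pipe-delimited field into a list, dropping empty items."""
    return [item for item in (value or "").split("|") if item]


def dedupe_images(urls):
    """Drop URLs that differ only by scheme (http vs https), keeping first-seen order."""
    seen = set()
    unique = []
    for url in urls:
        # Compare URLs without the scheme prefix.
        key = url.split("://", 1)[-1]
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique


def normalize_record(record):
    """Return a copy of one product with list fields split and de-duplicated."""
    product = dict(record["product"])
    product["category"] = split_field(product.get("category"))
    # dict.fromkeys removes repeats while preserving order.
    product["variations"] = list(dict.fromkeys(split_field(product.get("variations"))))
    product["images"] = dedupe_images(split_field(product.get("images")))
    return product
```

After loading the downloaded file with `json.load`, each element of the resulting list can be passed to `normalize_record`. Fields that are missing from a record, such as `description` in some of the sample products, are simply left absent.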