American Apparel is a North American manufacturer and retailer of fashion apparel based in Los Angeles, California. The company was founded in 1989 by Canadian businessman Dov Charney. The parser presented in this article lets you collect information about the products sold in the company's online store: americanapparel.net.
Approximate number of products: 500
Approximate number of requests: 500
Recommended subscription plan: Free
WARNING! The number of requests can exceed the number of products, because data about variations, images, and so on may be parsed using requests to additional resources. Part of the product data may also be delivered through XHR requests, which further increases the total number of requests required.
To use the parser, you need an account with our Diggernaut service.
- Follow this link to register with the Diggernaut service
- After registering and confirming your email address, log in to your account
- Create a project with any name and description; if you don't know how, refer to our documentation
- Open the newly created project and create a digger in it with any name; if you don't know how, refer to our documentation
- Copy the digger scenario below to the clipboard and paste it into the digger you created; if you don't know how, refer to our documentation
- Switch the digger's mode from Debug to Active; if you don't know how, refer to our documentation
- Run your digger and wait for it to finish; if you don't know how, refer to our documentation
- Download the collected dataset in the format you need; if you don't know how, refer to our documentation
Afterwards you can set up a schedule for running your parser and collect the data regularly.
Parser scenario:
---
config:
    debug: 2
    agent: Firefox
do:
- link_add:
    url: http://store.americanapparel.net
- link_add:
    url: http://store.americanapparel.net/en/factory-store/
- walk:
    to: links
    do:
    - find:
        path: .cd-primary-nav a
        do:
        - parse:
            attr: href
        - normalize:
            routine: url
        - link_add:
            pool: main
- walk:
    to: links
    pool: main
    do:
    - find:
        path: .product > a
        do:
        - parse:
            attr: href
        - normalize:
            routine: url
        - link_add:
            pool: sub
- walk:
    to: links
    pool: sub
    do:
    - sleep: 3
    - find:
        path: .pdp
        do:
        - variable_clear: allli
        - variable_clear: descr
        - variable_clear: li
        - variable_clear: id
        - variable_clear: views
        - variable_clear: color
        - variable_clear: imgnum
        - variable_clear: imgxl
        - variable_clear: viewsnum
        - variable_clear: stp
        - object_new: product
        - eval:
            routine: js
            body: '(function (){var d = new Date(); return d.toISOString()})();'
        - object_field_set:
            object: product
            field: date
        - static_get: url
        - object_field_set:
            object: product
            field: url
        - find:
            in: doc
            path: head meta[name="description"]
            do:
            - parse:
                attr: content
            - space_dedupe
            - trim
            - to_block
            - node_replace:
                path: br
                with: "\n"
            - split:
                context: text
                delimiter: \n+
            - find:
                path: div.splitted
                slice: 0
                do:
                - parse
                - space_dedupe
                - trim
                - object_field_set:
                    object: product
                    field: description
        - find:
            path: .product-style
            do:
            - parse
            - space_dedupe
            - trim
            - object_field_set:
                object: product
                field: sku
        - find:
            path: .price
            do:
            - find:
                path: .red-text
                do:
                - parse:
                    filter:
                    - (\d+\.?\d*)
                - if:
                    match: (\d)
                    do:
                    - object_field_set:
                        object: product
                        field: price
                        type: float
                    - register_set: USD
                    - object_field_set:
                        object: product
                        field: currency
                    - register_set: 1
                    - variable_set: stp
            - find:
                path: span[data-test="test"]
                do:
                - variable_get: stp
                - if:
                    match: (1)
                    else:
                    - parse:
                        filter:
                        - (\d+\.?\d*)
                    - if:
                        match: (\d)
                        do:
                        - object_field_set:
                            object: product
                            field: price
                            type: float
                        - register_set: USD
                        - object_field_set:
                            object: product
                            field: currency
        - find:
            path: .product-name
            do:
            - parse
            - space_dedupe
            - trim
            - object_field_set:
                object: product
                field: name
        - find:
            path: .main-img
            do:
            - parse:
                attr: src
            - object_field_set:
                object: product
                field: images
                joinby: "|"
        - find:
            path: .logo
            slice: 0
            do:
            - parse
            - space_dedupe
            - trim
            - object_field_set:
                object: product
                field: brand
        - find:
            path: .breadcrumbs a
            do:
            - parse
            - space_dedupe
            - trim
            - object_field_set:
                object: product
                field: category
                joinby: "|"
        - find:
            path: '.product-details > input#skuVarData'
            do:
            - parse:
                attr: value
            - normalize:
                routine: replace_substring
                args:
                    \s+\/\s+: _
            - normalize:
                routine: json2xml
            - to_block
            - find:
                path: body_safe > name
                do:
                - parse
                - space_dedupe
                - trim
                - object_field_set:
                    object: product
                    field: name
            - find:
                path: colors
                do:
                - find:
                    path: zoomimage
                    do:
                    - parse:
                        filter:
                        - \s*(.+)\?
                    - variable_set: imgxl
                    - register_set: <%imgxl%>?$ProductZoom$
                    - object_field_set:
                        object: product
                        field: images
                        joinby: "|"
                - find:
                    path: name
                    do:
                    - parse
                    - space_dedupe
                    - trim
                    - object_field_set:
                        object: product
                        field: variations
                        joinby: "|"
        - object_save:
            name: product
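The scenario relies on two regular expressions: `(\d+\.?\d*)` to pull a number out of the price text, and `\s*(.+)\?` to cut the query string off each `zoomimage` value before `?$ProductZoom$` is appended. The Python sketch below mirrors those two steps so you can try the patterns outside Diggernaut. It is only an approximation: the function names `extract_price` and `zoom_url` are hypothetical, and it assumes the `filter` option returns the first capture group of the first match.

```python
import re

# Same pattern the scenario passes to `parse: filter:` for prices.
PRICE_RE = re.compile(r"(\d+\.?\d*)")
# Same pattern the scenario applies to each `zoomimage` value.
ZOOM_RE = re.compile(r"\s*(.+)\?")


def extract_price(text):
    """Return the first number found in `text` as a float, or None."""
    match = PRICE_RE.search(text)
    return float(match.group(1)) if match else None


def zoom_url(raw):
    """Drop the query string and request the $ProductZoom$ preset instead."""
    match = ZOOM_RE.search(raw)
    if match is None:
        return None
    return match.group(1) + "?$ProductZoom$"
```

For example, `extract_price("$18.00")` yields `18.0`, and a text without digits yields `None`, which corresponds to the `if: match: (\d)` guard in the scenario that skips saving the price.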
Below is a sample dataset with several products in JSON format (for clarity). The dataset can also be downloaded as CSV, XLSX, XML, or any other text format using the template approach.
[{
"product": {
"brand": "American Apparel ®",
"category": "Women|Multipacks",
"currency": "USD",
"date": "2017-12-05T18:06:21.973Z",
"description": "The 50/50 Crewneck T-Shirt is a super-soft Poly-Cotton t-shirt featuring a slightly scooped neck and perfectly worn feel.",
"images": "http://s7d9.scene7.com/is/image/AmericanApparel/bb301w_white?defaultImage=/notavail&$ProductImage2.5$|http://s7d9.scene7.com/is/image/AmericanApparel/bb301w_white?defaultImage=/notavail&$ProductImage2.5$|http://s7d9.scene7.com/is/image/AmericanApparel/bb301w_asphalt?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/bb301w_black?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/bb301w_gold?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/bb301w_kellygreen?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/bb301w_navy?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/bb301w_orchid?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/bb301w_pink?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/bb301w_red?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/bb301w_truffle?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/bb301w_white?$ProductZoom$",
"name": "50/50 Crewneck T-Shirt",
"price": 18,
"sku": "bb301w",
"url": "http://www.americanapparel.com/en/50-50-crewneck-t-shirt_bb301w?c=White",
"variations": "Asphalt|Black|Gold|Kelly Green|Navy|Orchid|Pink|Red|Truffle|White"
}
}
,{
"product": {
"brand": "American Apparel ®",
"category": "Women|T-Shirts & Tanks|Tanks",
"currency": "USD",
"date": "2017-12-05T18:06:25.305Z",
"description": "The 50/50 tank is a sexy tank with generously cut arm openings and a slim racerback in our super-soft Poly-Cotton fabric.",
"images": "http://s7d9.scene7.com/is/image/AmericanApparel/bb308w_navy?defaultImage=/notavail&$ProductImage2.5$|http://s7d9.scene7.com/is/image/AmericanApparel/bb308w_navy?defaultImage=/notavail&$ProductImage2.5$|http://s7d9.scene7.com/is/image/AmericanApparel/bb308w_asphalt?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/bb308w_black?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/bb308w_gold?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/bb308w_kellygreen?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/bb308w_navy?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/bb308w_orchid?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/bb308w_pink?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/bb308w_red?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/bb308w_truffle?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/bb308w_white?$ProductZoom$",
"name": "50/50 Tank",
"price": 16,
"sku": "bb308w",
"url": "http://www.americanapparel.com/en/50-50-tank_bb308w?c=Navy",
"variations": "Asphalt|Black|Gold|Kelly Green|Navy|Orchid|Pink|Red|Truffle|White"
}
}
,{
"product": {
"brand": "American Apparel ®",
"category": "Women|Basics Shop",
"currency": "USD",
"date": "2017-12-05T18:06:28.613Z",
"description": "The 50/50 Loose Crop Tee is a loose-fitting cropped t-shirt in our ultra-soft 50/50 Poly-Cotton blend. Perfect for layering or paired with high-waist skirts, pants and shorts.",
"images": "http://s7d9.scene7.com/is/image/AmericanApparel/rsabb380w_white?defaultImage=/notavail&$ProductImage2.5$|http://s7d9.scene7.com/is/image/AmericanApparel/rsabb380w_white?defaultImage=/notavail&$ProductImage2.5$|http://s7d9.scene7.com/is/image/AmericanApparel/rsabb380w_black?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/rsabb380w_navy?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/rsabb380w_orchid?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/rsabb380w_pink?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/rsabb380w_red?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/rsabb380w_white?$ProductZoom$",
"name": "50/50 Loose Crop Tee",
"price": 18,
"sku": "rsabb380w",
"url": "http://www.americanapparel.com/en/50-50-loose-crop-tee_rsabb380w?c=White",
"variations": "Black|Navy|Orchid|Pink|Red|White"
}
}
,{
"product": {
"brand": "American Apparel ®",
"category": "Women|Multipacks",
"currency": "USD",
"date": "2017-12-05T18:06:31.899Z",
"description": "The Tri-Blend Racerback Tank is a sexy tank with generously cut arm openings and a slim racerback in our ultra soft Tri-Blend fabric. • Polyester retains shape and elasticity; Cotton lends both comfort and durability; addition of Rayon makes for a unique texture and drapes against the body for a slimming look",
"images": "http://s7d9.scene7.com/is/image/AmericanApparel/tr308w_tri-lieutenant?defaultImage=/notavail&$ProductImage2.5$|http://s7d9.scene7.com/is/image/AmericanApparel/tr308w_tri-lieutenant?defaultImage=/notavail&$ProductImage2.5$|http://s7d9.scene7.com/is/image/AmericanApparel/tr308w_athleticblue?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/tr308w_athleticgrey?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/tr308w_tri-black?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/tr308w_tri-creolepink?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/tr308w_tri-indigo?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/tr308w_tri-lieutenant?$ProductZoom$|http://s7d9.scene7.com/is/image/AmericanApparel/tr308w_tri-red?$ProductZoom$",
"name": "Tri-Blend Racerback Tank",
"price": 18,
"sku": "tr308w",
"url": "http://www.americanapparel.com/en/tri-blend-racerback-tank_tr308w?c=Tri-Lieutenant",
"variations": "Athletic Blue|Athletic Grey|Tri-Black|Tri-Creole Pink|Tri-Indigo|Tri-Lieutenant|Tri-Red"
}
}]
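In the sample above, `category`, `images`, and `variations` are stored as single strings joined with `|`, and the main image URL appears twice in `images`. If you process the JSON dataset in your own code, you will probably want lists instead. The sketch below shows one possible way to do that in Python; the helper names and the `products.json` file name in the usage note are hypothetical, and the structure is assumed to match the sample (a list of objects, each with a `product` key).

```python
import json

# Fields that the scenario fills with `joinby: "|"`.
PIPE_FIELDS = ("category", "images", "variations")


def split_joined(value, dedupe=False):
    """Split a pipe-joined string into a list, optionally dropping repeats."""
    parts = value.split("|") if value else []
    if dedupe:
        # dict.fromkeys keeps first-seen order while removing duplicates.
        parts = list(dict.fromkeys(parts))
    return parts


def expand_record(record):
    """Return a copy of one product with pipe-joined fields turned into lists."""
    product = dict(record["product"])
    for field in PIPE_FIELDS:
        # Only image URLs are deduplicated; other fields keep every entry.
        product[field] = split_joined(product.get(field), dedupe=(field == "images"))
    return product


def load_products(path):
    """Read a downloaded JSON dataset and expand every product in it."""
    with open(path, encoding="utf-8") as handle:
        return [expand_record(item) for item in json.load(handle)]
```

Calling `load_products("products.json")` on a downloaded dataset would return a list of dictionaries in which `variations` is a list of color names and `images` is a list of unique URLs.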