AliExpress order list aggregator script and more (#R8)

Things you might already know: AliExpress is a shitty website.

Don’t get me wrong, I shop a lot at AE. But the page is extremely slow due to massive amounts of JavaScript bloat, and aside from JS, it’s hacked together like I would do for a quick proof of concept. That shit would never fly with me, but for AE, it just does the job. When adding bling and JS spinners has priority over code quality…

It also annoys the fuck out of me by resetting to the German translation pretty frequently. I. DO. NOT. WANT. YOUR. FUCKING. TRANSLATION. DC-DC buck modules are not money modules. No-clean flux isn’t dirty flux. And so on. It’s often hard enough to guess the Chinglish of the seller, but it doesn’t get ANY BETTER when translating it again. Garbage in, garbage out.

It’s apparently fixed now, but for a significant part of 2019, AE also switched me to Russian. Not straight away, but in a sneaky way. I was able to shop on the global website (once I escaped German), you know, searching items, putting them into your shopping cart, buying. At the very moment of hitting the “Buy” button, it switched me to Russian. Which means that any notification on that very order is also in Russian. I got quite a lot of “Платёж совершён” (“payment completed”) mails, which do not carry the order number like the generic English ones do, e.g. “Your order #(long number ending in 234) has been paid successfully”.

Speaking of notifications: Since the “new” communication system was implemented and they didn’t manage to transfer old messages to it, I also never got a single mail about new messages from a seller. The subscription page says it’s enabled (not sure if that includes shops that I haven’t bought from yet), but it’s not. Oh, and you might want to fix the crappy i18n placeholder for the most commonly used foreign language on your platform…

Now for the reason to write this rant: Order confirmation mails are utterly useless.

I got 300 of those in a Thunderbird subfolder, and I get jack when searching for anything. What’s the point in sending these? It’s a pure waste of bandwidth and storage space.

For a while, I copied the product description to a self-addressed mail in order to keep some details, like pin mappings. That’s certainly useful but also takes a lot of time; formatting also gets lost, so saving as HTML would be nicer (but harder to search). I haven’t found a good solution yet.

The order search function on AE is also pretty useless. Sure, IF you know your order number you can jump straight to it, but if you only remember it was RGB strips, it’s easier to scroll through the pages by estimated transaction year than to actually search for any term that is altered three times before it is applied as a search word. As the main item search still cannot perform searches with quotation marks or minus signs to get exact hits on your desired stuff without the tons of crap that “might be related”, I guess searching completed orders with regexes would blow up their datacenter. It’s also too much to ask for a downloadable list of orders or more than ten items per page, right…

So upon closer inspection I figured I could save all of my orders from the overview in HTML format in a single document with reasonable effort. Doesn’t save me datasheets, but as most orders are archived at a reasonable level (which is actually a great feature and much better than eBay!), that could do.

Here are step-by-step instructions for doing this (if you are a Linux or maybe OSX user):

1) Log into AE and open the order page.
2) Save the entire page with your browser into an empty folder, typical name “My AliExpress : Manage Orders##” where ## is the page number. We run a for loop over these names, so change the script accordingly if you choose different ones here.
2.5) Lose your shit and rant about it when, after everything is working, the order of the pages is suddenly messed up. Reason for this: The page title is not “My AliExpress : Manage Orders” with ordinary spaces (0x20). It’s “My 0xC2A0 AliExpress : Manage 0xC2A0 Orders”, 0xC2 0xA0 being the UTF-8 non-breaking space that is only used between the words; the colon might stick to either of them. NON-BREAKING SPACES IN HTML TITLES – YOU MONSTERS. At that point, Windows users might just cry about their fucked up file system, reformat and start from scratch.
3) Repeat for all of your order pages and those that you have renamed afterwards. Or rename all of them afterwards. Or before. Just make them appear in order when running ls -l…
4) Save, chmod +x and run this nifty little script in the working directory:

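#!/bin/bash
# Collect the order tables from all saved AE order pages into MyAEdata.txt
# and copy the images/CSS they reference into one shared folder.
globalfolder="MyAEitems"
rm -f MyAEdata.txt          # -f: no error on the very first run
mkdir -p "$globalfolder"    # -p: no error if the folder already exists
for f in My*htm
do
    basef="${f%%.*}"              # file name up to the first dot
    nr="${basef: -2}"             # last two characters = page number
    folderstr="${basef}_files"    # folder the browser saved next to the page
    echo "Processing $f (order page $nr)..."
    # Keep everything from the first order block up to the <tfoot> line,
    # turn (encoded) non-breaking spaces into plain ones, point the paths
    # to the shared folder and drop the last line (the <tfoot> line itself).
    grep -A 99999 "order-item-wraper" "$f" \
        | grep -B 99999 "<tfoot>" \
        | sed -e "s/\xC2\xA0/ /g" -e "s/%C2%A0/ /g" -e "s/%20/ /g" \
              -e "s/$folderstr/$globalfolder/g" -e '$d' \
        >> MyAEdata.txt
    cp "$folderstr"/*.jpg "$folderstr"/*.css "$globalfolder"
done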

I bet one can package this into a single line of high quality sed or awk code, but I’m not the guy that wants to do this for fun. I never got LPIC-1 tested/certified because the local test provider wasn’t able to schedule a single appointment over the course of an entire year…aaand then the edu discount code expired. Fuck you, $verylargeITserviceprovider!
Also big THX to the online HTML escaping tool at FreeFormatter.com; without escaping, code pasted into WordPress doesn’t show up as code and likely gets interpreted by your browser instead.

5) Now you’re left with the data block in MyAEdata.txt and the source HTML files. Duplicate one of them as a donor and implant the block. Currently that’s starting at line 649, but that may obviously change over time. Searching for the table header at <table id="buyer-ordertable" class="util-clearfix"> is a good start. The data itself starts in a tbody class “order-item-wraper ” (yes, WRAPER, and yes, that’s a space at the end, because the last element has class “order-item-wraper last-tbody”…)
6) Open and enjoy your complete list of AE purchases. Save/move with the corresponding folder to preserve the CSS and images associated. If you want a different data folder, change the globalfolder="MyAEitems" entry in the script.
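
The renaming from steps 2.5 and 3 can probably be scripted as well. This is only a rough sketch: the function name fix_nbsp_names is made up, and it assumes that everything to be renamed (pages and their “_files” folders) starts with “My” and sits in the current directory. It does not touch page numbers, so single-digit pages still need a leading zero by hand.

```shell
#!/bin/bash
# Replace UTF-8 non-breaking spaces (bytes 0xC2 0xA0) in the names of the
# saved pages and their "_files" folders with ordinary spaces.
# Assumption: all of them start with "My", as in step 2.
fix_nbsp_names() {
    local nbsp=$'\xC2\xA0' old new
    for old in My*; do
        [ -e "$old" ] || continue          # glob matched nothing
        new="${old//$nbsp/ }"              # swap every NBSP for a space
        [ "$old" = "$new" ] && continue    # name was already clean
        mv -n -- "$old" "$new"             # -n: never overwrite a file
    done
}
```

Run fix_nbsp_names inside the folder with the saved pages, before running the aggregator script.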

Minimal working outer shell:


<!DOCTYPE html>
<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
<title>My AliExpress : Manage Orders</title>
<link rel="stylesheet" type="text/css" href="MyAEitems/a.css">
</head>
<body data-spm="9042311" class="aliexpress lang-en">
<link href="MyAEitems/me-header.css" rel="stylesheet" type="text/css">
<div class="grid-col-container">

<div class="me-ui-box">
<table id="buyer-ordertable" class="util-clearfix">
<colgroup><col class="selector">
<col class="product-name"> <col class="product-action"> <col class="order-status"> <col class="order-action">
</colgroup>
<thead>
<tr class="product-table-title">
<th class="selector" style="display:none;"></th><th class="product-name" colspan="2" style="padding-left:10px;">Product </th><th class="product-action">Product Action</th><th class="order-status">Order Status</th><th class="order-action">Order Action</th>
</tr>
</thead>

<!-- your data block here -->

</table>
</div>
</div>
</body></html>
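
Implanting the data block from step 5 into this shell doesn’t have to be done by hand either. A minimal sketch, assuming the shell above was saved to a file that still contains the “your data block here” comment on a line of its own; the function name implant_block and the file names in the usage line are placeholders, not anything AE or the script above dictates.

```shell
#!/bin/bash
# Print the shell file to stdout, replacing the marker line with the
# contents of the data block.
# $1 = shell HTML file, $2 = data block (e.g. MyAEdata.txt)
implant_block() {
    awk -v data="$2" '
        /<!-- your data block here -->/ {
            # copy the data block in place of the marker line
            while ((getline line < data) > 0) print line
            close(data)
            next
        }
        { print }
    ' "$1"
}
```

Usage would be something like: implant_block shell.html MyAEdata.txt > MyAEorders.html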

Of course, for a numberphile like me, that list opens up a few data processing opportunities…

For the sake of simplicity, order values have been converted to the local currency (€) with a fixed conversion rate of 1.15 USD = 1.00 EUR. I actually have never paid in GBP or other currencies on AE! Large orders do not include local VAT or credit card fees. “Item” is defined as an entry in the shopping cart, e.g. a 100 pack of LEDs equals one item, not 100. Transaction time has been converted to local CET/CEST time, but the transaction day could be off.
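
For reference, the fixed-rate conversion boils down to a single division. A tiny sketch; the function name usd_to_eur is made up, and the amount in the example call is picked for illustration only, not taken from an actual order.

```shell
#!/bin/bash
# Convert a USD amount to EUR at the fixed rate used here
# (1.15 USD = 1.00 EUR), rounded to cents.
usd_to_eur() {
    LC_ALL=C awk -v usd="$1" 'BEGIN { printf "%.2f\n", usd / 1.15 }'
}

usd_to_eur 1.15    # prints 1.00
```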

Number of orders: 281
Number of items: 299
Item breakdown: 233x single item (78%), 49x two items (16%), 6×3, 6×4, 3×5, 1×6, 1×9
Number of disputes: 20 (7.1% = 1 in 14)

Min order value: 0.42€ (two USB-C to Micro adapters)
Max order value: 164.31€ (the LEDs from the growlight repair; only this and four others were above the VAT-free order limit of 22€)
Avg order value: 5.54€ (I don’t trust the LO calculated standard deviation of 12€)
Total order value: 1558€

Date of first order: 28.07.2015 (1468 days ago)
Date of most recent order: today :mrgreen: 04.08.2019
Orders per day: 0.19 = one order per 5 days, 5h 23min
Order value per day: 1.06€
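
The derived figures above follow directly from the three totals (281 orders, 1558€, 1468 days). A small sketch of the arithmetic; the function name order_rates is made up, and the rounding to full minutes is my assumption about how the interval was obtained.

```shell
#!/bin/bash
# Derive average and per-day figures from the totals.
# $1 = number of orders, $2 = total value in EUR, $3 = days since first order
order_rates() {
    LC_ALL=C awk -v n="$1" -v total="$2" -v days="$3" 'BEGIN {
        interval = days / n                          # days between two orders
        d = int(interval)                            # full days
        mins = int((interval - d) * 24 * 60 + 0.5)   # rest, rounded to minutes
        printf "avg order value: %.2f EUR\n", total / n
        printf "orders per day: %.2f\n", n / days
        printf "one order per %d days, %dh %02dmin\n", d, int(mins / 60), mins % 60
        printf "order value per day: %.2f EUR\n", total / days
    }'
}

order_rates 281 1558 1468
```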

Most orders per day: 5 (18.02.2019 and 20.03.2019)
Orders by hour of day:
00:00 to 01:00: 22x
01:00 to 02:00: 17x
02:00 to 03:00: 9x
03:00 to 04:00: 1x (03:58)
04:00 to 05:00: zip!
05:00 to 06:00: nada!
06:00 to 07:00: nüscht!
07:00 to 08:00: 1x (07:06)
08:00 to 09:00: 2x
09:00 to 10:00: 6x
10:00 to 11:00: 1x
11:00 to 12:00: 4x
12:00 to 13:00: 14x
13:00 to 14:00: 1x
14:00 to 15:00: 12x
15:00 to 16:00: 11x
16:00 to 17:00: 13x
17:00 to 18:00: 15x
18:00 to 19:00: 11x
19:00 to 20:00: 16x
20:00 to 21:00: 24x
21:00 to 22:00: 36x
22:00 to 23:00: 33x
23:00 to 24:00: 32x
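
A breakdown like this one can be counted with a few lines of awk. The sketch below assumes the order times have already been pulled out and converted to local time, one HH:MM per line; how to get them out of the saved pages is not covered here, and both the function name orders_by_hour and the input file in the usage line are hypothetical.

```shell
#!/bin/bash
# Count orders per hour of day. Reads one local time per line (HH:MM)
# from stdin and prints one line per hour, including the empty ones.
orders_by_hour() {
    awk -F: '
        /^[0-9][0-9]:[0-9][0-9]/ { count[$1 + 0]++ }   # $1 = hour part
        END {
            for (h = 0; h < 24; h++)
                printf "%02d:00 to %02d:00: %dx\n", h, h + 1, count[h]
        }
    '
}
```

Usage would be something like: orders_by_hour < times.txt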

Oh, and the entire PNG export via the otherwise excellent Screengrab extension for Firefox is a cool 32760px in height – and it’s cropped at about 50% / December 2017! Even though I’m not buying sex toys like bigclive, I’m only comfortable sharing the scaled down version at 10kpx… ;)

Wasting half a Sunday with bash scripts and nonsense statistics: Check.
Would do again, 8/10. :roll:

