Export content from an ecommerce site without backend or database access

I have a site that I'm looking to transfer to Volusion. Importing tabled content into Volusion is a breeze; it's getting the content into tables that's the issue. The old site has no real ability to export, nor do I know how to get at its database. I'm thinking there must be some sort of script I can write to take the content from the frontend and download it as some sort of list that I can put into a CSV and import into Volusion.
www.twincitygreetings.com
Any suggestions? I'm hoping to get into the image directory as well and download all of the images for upload to the new site.

You are going to need, at the very least, a file with product code, product name, weight, and price.
Looking at the URL you provided, it doesn't appear that the products there follow any orderly structure that would let you target the images folder or individual products based on a known piece of information like a product code. Unless the back-end has some type of product export function, you may have no choice but to recreate it from scratch.
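If you do end up rebuilding the file, a minimal CSV along these lines is the target (the column names and rows here are illustrative; check Volusion's import template for the exact headers it expects):
productcode,productname,productweight,productprice
TCG-0001,Birthday Card - Floral,0.1,3.99
TCG-0002,Anniversary Card - Classic,0.1,4.49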

I don't know if you solved this yet or not, but I would suggest scraping the data, provided the information is still on the old site. This can be done easily using VBScript and Excel, or if you aren't very savvy at coding you could look at a piece of software called Mozenda. There are a whole variety of methods that can be used to scrape data, all of them pretty easy to learn with a bit of research. Basically, you write a script that will crawl the DOM and extract the data (exporting to XML works best in my experience).
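For instance, here is a rough sketch of the idea in C# using the HtmlAgilityPack library (the same package mentioned in the logo-scraping thread below); the URL and XPath expressions are placeholders you would adapt to the old site's actual markup:

// Sketch: crawl a product listing page and write name/price rows to a CSV.
// Requires the HtmlAgilityPack NuGet package. The URL and XPath below are
// placeholders - inspect the real site's HTML and adjust them.
using System.IO;
using HtmlAgilityPack;

class ProductScraper
{
    static void Main()
    {
        var web = new HtmlWeb();
        var doc = web.Load("http://www.example.com/products.html"); // placeholder URL
        using (var csv = new StreamWriter("products.csv"))
        {
            csv.WriteLine("name,price");
            var nodes = doc.DocumentNode.SelectNodes("//div[@class='product']");
            if (nodes == null) return; // SelectNodes returns null when nothing matches
            foreach (var node in nodes)
            {
                var name = node.SelectSingleNode(".//h2")?.InnerText.Trim();
                var price = node.SelectSingleNode(".//span[@class='price']")?.InnerText.Trim();
                csv.WriteLine($"\"{name}\",\"{price}\"");
            }
        }
    }
}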
Hope this helps.

Related

How to download Amazon MWS Customisation fields

I'm looking into how to process customisation fields for Amazon orders and according to their MWS API Docs, if a customer chooses to personalise his order, then a URL to download this data comes down in the Order Item XML's BuyerCustomizedInfo node:
<OrderItem xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <ASIN>ABC123</ASIN>
  ...
  <ConditionSubtypeId>New</ConditionSubtypeId>
  <BuyerCustomizedInfo>
    <CustomizedURL>https://zme-caps.amazon.com/t/ABC123/ABC123/1</CustomizedURL>
  </BuyerCustomizedInfo>
</OrderItem>
My client has given me two such orders to look at, and when I click on those links all I get is
NoSuchURL: Url id 'ABC123' has expired or does not exist!
I know that the ZIP will contain JSON which I will have to parse and may also contain references to SVGs, and that I must also make the code extra robust when dealing with customisation fields.
Am I getting this error because these links are time sensitive or one time use only? Or is it something else?
First off, I'm not a developer; I'm an Amazon seller. I found your question while doing research, as I'm trying to figure out what is possible, sketch a plan for a similar system, and then hire a developer.
I've pasted some info below that I found for the US marketplace; the implementation of Amazon Custom in the European marketplaces may not be the same as in the US, though.
In general it is very hard to get good info on anything to do with Amazon Custom, and it seems to have a messed-up logic of its own. Feel free to ask anything, though, and I will help if I can.
First of all, make sure you have the most up-to-date Amazon MWS Orders API SDK. If you don't, and refuse to update, you can pull the orders through the Reports API instead; that will include the ZIP URL, but you'll have to parse it yourself and life will be hell.
Next, for the order, call ListOrderItems which you probably already do. You’ll see the customization in the response XML under BuyerCustomizedInfo -> CustomizedURL.
This is a ZIP. Download it using cURL, and put plenty of checks and fallbacks in place, because the download will fail sometimes.
Extract the ZIP to a folder. Inside that folder there will be a JSON file.
Parse that JSON file and you’ll probably know where to go from there for putting that information into your system.
Depending on how you've configured your product, there may also be an SVG file that you'll want to parse to get some customization info. Specifically json->{'version3.0'}->customizationInfo->surfaces (each surface)->areas. Each area should be a text line or an image. At least, that is how it works for the way we've set up products.
As always, put lots of checks, try/catches, fallbacks, and error alerts in place.
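To make those steps concrete, here is a hedged C# sketch of the download/extract/parse flow. It assumes the CustomizedURL has already been pulled from the ListOrderItems response, and the JSON property names simply follow the version3.0 path described above - verify them against a real file:

// Sketch: download the customization ZIP, extract it, and walk the JSON.
// Wrap every step in retries/try-catch in real code - downloads do fail.
using System;
using System.IO;
using System.IO.Compression;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

class CustomizationFetcher
{
    static async Task Main()
    {
        // URL comes from BuyerCustomizedInfo -> CustomizedURL in ListOrderItems.
        var url = "https://zme-caps.amazon.com/t/ABC123/ABC123/1";
        var zipPath = "customization.zip";
        var extractDir = "customization";

        using (var http = new HttpClient())
        {
            var bytes = await http.GetByteArrayAsync(url); // no auth required
            File.WriteAllBytes(zipPath, bytes);
        }

        ZipFile.ExtractToDirectory(zipPath, extractDir);

        foreach (var jsonFile in Directory.GetFiles(extractDir, "*.json"))
        {
            using var doc = JsonDocument.Parse(File.ReadAllText(jsonFile));
            // Assumed path: version3.0 -> customizationInfo -> surfaces -> areas.
            var surfaces = doc.RootElement
                .GetProperty("version3.0")
                .GetProperty("customizationInfo")
                .GetProperty("surfaces");
            foreach (var surface in surfaces.EnumerateArray())
                foreach (var area in surface.GetProperty("areas").EnumerateArray())
                    Console.WriteLine(area); // each area should be a text line or image
        }
    }
}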
The links are time-sensitive and expire after 6 months, I think.
The links should be a little more complex; if that is exactly the link you are seeing, it's incorrect.
You don't need any auth to download them, and the easiest way to test them is via the MWS Scratchpad.

How to add first name and email before uploading a video?

Hi guys, I'm brand new and not a developer, but I need a way for users who visit my site to upload their video, with an option for them to add their first name and email, so that when the video is uploaded the database keeps all the data together.
Ideally I want this to be as easy as possible for the user, and the video would just go to our YouTube channel, though any video platform will work. Any advice would be great!
Please provide more information, like what platform you are using.
There's more than one way to skin a cat.
The simple way to achieve this with web technologies (PHP, Node, Java) is to keep the basic user information in the session, and use it whenever necessary.
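As a minimal sketch of that idea in ASP.NET Core (the same session pattern applies in PHP, Node, or Java; the endpoint names are made up for illustration):

// Sketch: keep the user's first name and email in the session while the
// video uploads, then save everything together as one record afterwards.
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddDistributedMemoryCache();
builder.Services.AddSession();
var app = builder.Build();
app.UseSession();

// Step 1: the user submits name + email before the upload starts.
app.MapPost("/details", (HttpContext ctx, string firstName, string email) =>
{
    ctx.Session.SetString("firstName", firstName);
    ctx.Session.SetString("email", email);
    return Results.Ok();
});

// Step 2: once the video is uploaded, read the details back and store
// (firstName, email, videoUrl) as a single database row.
app.MapPost("/uploaded", (HttpContext ctx, string videoUrl) =>
{
    var firstName = ctx.Session.GetString("firstName");
    var email = ctx.Session.GetString("email");
    // ... insert (firstName, email, videoUrl) into your database here ...
    return Results.Ok();
});

app.Run();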
You need to get some knowledge about the system you are using. You particularly need:
access to the server
to know the server type
access to the database
to know the database type
where the relevant files are
After you have gathered all this information, you at least know what you do not know. The next step is to gather information about how you can implement the feature you need. Look at it like a puzzle with many small pieces. If you are patient enough, you will solve the puzzle in the end.

Storing Images in SQL Server database

I'm trying to create a sample ASP.NET MVC application with a ViewModel and onion architecture - very simple online shop.
As you'd suppose, this shop has products, and each product should have one very small image; when the user clicks on a product, he is redirected to a details page where, of course, he should see a bigger image of the product.
At first I thought, since it's a simple application, I would just store (internet) links to the pictures in the database. But then I thought: what about when an image is erased from the internet? My product will no longer have an image.
So I should store those pictures in the database somehow. I have heard that something called FILESTREAM is the right way, but I have found no material that helped me understand what it is.
I hope someone can help me.
There are several options. You could save the picture in the database using a varbinary.
See here for how to read it back in MVC.
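A minimal sketch of the varbinary route with plain ADO.NET (the table and column names are made up for illustration):

// Sketch: store an image in a varbinary(max) column.
// Assumed schema: CREATE TABLE Products (Id int PRIMARY KEY, Image varbinary(max))
using System.Data.SqlClient;
using System.IO;

class ImageStore
{
    static void SaveImage(string connectionString, int productId, string imagePath)
    {
        byte[] imageBytes = File.ReadAllBytes(imagePath);
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(
            "UPDATE Products SET Image = @img WHERE Id = @id", conn))
        {
            cmd.Parameters.AddWithValue("@img", imageBytes);
            cmd.Parameters.AddWithValue("@id", productId);
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}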
When you opt for a solution where you split database and file storage, which is perfectly possible, you should consider that it could mean extra maintenance for cross-checking deleted records, etc.
If you choose the last option, the information in the article will mostly suit your needs too.

Where to get an updated list of video games?

I am currently designing a reviews site for video games, similar to GameSpot, and am wondering where and whether there is an online database that contains information such as name, publisher, release date, etc., with an API. I don't really want to have to enter each title manually or let users enter titles manually.
Where do these large sites get information like this? I wouldn't think it would be done manually. I know that for movies IMDB exists.
How would I go about adding it to my database?
Thanks
May I point you to web scraping?
Be sure to read the sections on legal issues and on well-behaved bots.
There's always Amazon and their Product Advertising API. Some older, but interesting, code snippets can be found on this page.
If you know Perl, there is an amazing module called WWW::Mechanize.
You can pretty much write a script to get to any website and grab any data you need.
So, for example, you can go to www.gamespot.com, get a list like the one below, and put the entries in your database.
http://www.gamespot.com/games.html?platform=1029&mode=all&sort=views&dlx_type=all&sortdir=asc&official=all&tag=games%3Bfooter%3Bmore
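The same idea sketched in C# with HtmlAgilityPack instead of Perl (the XPath is a guess; inspect the live page's markup and adjust it):

// Sketch: fetch a games listing page and print title/link pairs ready for
// insertion into a database. The XPath is an assumption about the markup.
using System;
using HtmlAgilityPack;

class GameListScraper
{
    static void Main()
    {
        var web = new HtmlWeb();
        var doc = web.Load("http://www.gamespot.com/games.html?platform=1029&mode=all");
        var links = doc.DocumentNode.SelectNodes("//a[contains(@href, '/games/')]");
        if (links == null) return; // nothing matched - the markup guess was wrong
        foreach (var link in links)
            Console.WriteLine($"{link.InnerText.Trim()} -> {link.GetAttributeValue("href", "")}");
    }
}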

How to scrape logos from websites?

First off, this is not a question about how to scrape websites. I am fully aware of the tools available to me for scraping (css_parser, nokogiri, etc.; I'm using Ruby to do the scraping).
This is more of an overarching question on the best possible solution to scrape the logo of a website starting with nothing but a website address.
The two solutions I've begun to create are these:
Use Google AJAX APIs to do an image search that is scoped to the site in question, with the query "logo", and grab the first result. This gets the logo, I'd say, about 30% of the time.
The problem with the above is that Google doesn't really seem to care about CSS image-replaced logos (i.e. H1 text that is image-replaced with the logo). The solution I've tentatively come up with is to pull down all CSS files, scan for url() declarations, and then look for the words "header" or "logo" in the file names.
Solution two is problematic because of the many idiosyncrasies of all the people who write CSS for websites. They use Header instead of logo in the file name. Sometimes the file name is random, saying nothing about a logo. Other times, it's just the wrong image.
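For what it's worth, that url() scan only needs a regex over each stylesheet. A sketch (in C# for consistency with the other examples in this document; it translates directly to Ruby):

// Sketch: pull url(...) paths out of a stylesheet and keep the ones whose
// file name mentions "logo" or "header". The keyword list is the weak point.
using System;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

class CssLogoScanner
{
    static async Task Main()
    {
        using var http = new HttpClient();
        var css = await http.GetStringAsync("http://example.com/styles.css"); // placeholder
        foreach (Match m in Regex.Matches(css, @"url\(\s*['""]?([^'"")]+)['""]?\s*\)"))
        {
            var path = m.Groups[1].Value;
            if (path.IndexOf("logo", StringComparison.OrdinalIgnoreCase) >= 0 ||
                path.IndexOf("header", StringComparison.OrdinalIgnoreCase) >= 0)
                Console.WriteLine(path);
        }
    }
}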
I realize I might be able to do something with some sort of machine learning, but I'm on a bit of a deadline for a client and need something fairly capable soon.
So with all that said, if anyone has any "out of the box" thinking on this one, I'd love to hear it. If I can create a solution that works well enough, I plan on open-sourcing the library for any other interested parties :)
Thanks!
Check this API by Clearbit. It's super simple to use:
Just send a query to:
https://logo.clearbit.com/[enter-domain-here]
For example:
https://logo.clearbit.com/www.stackoverflow.com
and get back the logo image!
More about it here
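Fetching it from code is just a plain GET; a quick sketch:

// Sketch: download a site's logo via Clearbit's logo endpoint and save it.
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

class LogoFetcher
{
    static async Task Main()
    {
        using var http = new HttpClient();
        var bytes = await http.GetByteArrayAsync("https://logo.clearbit.com/stackoverflow.com");
        File.WriteAllBytes("logo.png", bytes);
    }
}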
I had to find logos for ~10K websites for a previous project and tried the same technique you mentioned of extracting the image with "logo" in the URL. My variation was that I loaded each webpage in WebKit, so that images referenced from CSS or JavaScript were loaded as well. This technique gave me logos for ~40% of websites.
Then I considered creating an app like Nick suggested to manually select the logo for the remaining websites, however I realized it was more cost effective to just give these to someone cheap (who I found via Elance) to do the work manually.
So I suggest don't bother solving this properly with a fully technical solution - outsource the manual labour.
Creating an application will definitely help you, but I believe in the end there will be some manual work involved. Here's what I would do.
Have your application store in a database a link to all images on a website that are larger than a specified dimension so that you can weed out small icons.
Then you can setup a form to access these results. You may want to setup the database table to store the website url and relationship between the url and image links.
Even if it were possible to write an application that could truly figure out whether an image is a logo, it seems like it would take a massive amount of code. In the end it would probably weed out even more than the above, but you have to take into account that it could be faster for a human to visually parse the results than the time it would take you to write and test the complex code.
Yet another simple way to solve this problem is to get all leaf nodes and get the first
<a><img src="http://example.com/a/file.png" /></a>
You can look for existing projects that extract HTML leaf nodes, or use regular expressions to get all HTML tags.
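A quick sketch of that first-linked-image heuristic, using HtmlAgilityPack (as in the next answer) rather than raw regular expressions:

// Sketch: take the first <img> inside an <a> as a logo candidate.
using System;
using HtmlAgilityPack;

class FirstLinkedImage
{
    static void Main()
    {
        var doc = new HtmlWeb().Load("http://example.com"); // placeholder site
        var img = doc.DocumentNode.SelectSingleNode("//a//img[@src]");
        Console.WriteLine(img?.GetAttributeValue("src", ""));
    }
}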
I used a C# console app with the HtmlAgilityPack NuGet package to scrape logos from 600+ sites.
The algorithm is to get all images that have "logo" in the URL.
The challenges you will face during such an extraction are:
Relative image URLs
Base URL is a CDN, HTTP/HTTPS (if you don't know the protocol before you make a request)
Images have ? or & with a query string at the end
With those things in mind I got approximately 70% success, but some images were not actual logos.
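A condensed sketch of that approach, including the URL normalization for the three gotchas above (the site URL is a placeholder):

// Sketch: find images with "logo" in the URL and normalize the address -
// resolving relative paths, protocol-relative CDN URLs, and query strings.
using System;
using HtmlAgilityPack;

class LogoScraper
{
    static void Main()
    {
        var baseUri = new Uri("http://example.com"); // placeholder site
        var doc = new HtmlWeb().Load(baseUri.ToString());
        var imgs = doc.DocumentNode.SelectNodes(
            "//img[contains(translate(@src, 'LOGO', 'logo'), 'logo')]");
        if (imgs == null) return;
        foreach (var img in imgs)
        {
            var src = img.GetAttributeValue("src", "");
            if (src.StartsWith("//"))
                src = baseUri.Scheme + ":" + src;              // protocol-relative CDN URL
            var absolute = new Uri(baseUri, src);              // resolves relative paths
            var clean = absolute.GetLeftPart(UriPartial.Path); // drops ?/& query strings
            Console.WriteLine(clean);
        }
    }
}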
