Setting up a multilingual website - subdomains or folders?

I'm in the process of creating alternative versions of our website - to be more precise, Italian, French, American English and Spanish versions - with the aim of providing location-dependent information for each market. There are three setups I can think of, but all seem to have their own problems...
Subdomains, so it.website.com, fr.website.com, etc. The one bonus with this setup is that we could host the subdomains on different servers, and potentially reduce load time by having the American site served from an American server. However, in terms of SEO you're effectively creating a new website (depending on what you read), so ranking may be more difficult.
Folders, so website.com/it/ etc. The bonus with this setup is that you're building on your main website's reputation, so in theory it should be easier to rank; interestingly, this is the method Apple uses. However, the downside is that the whole site would be served from a single server location, so load times may be affected.
The third option - country-code domains such as website.it and website.fr - is really the same as the first and has the same advantages, but comes with an obvious additional outlay for the domains.
Thanks for any help.
Dave

Related

Good (CMS-based?) platform for simple database apps

I need to implement yet another database website. Let's say roughly 5 tables, 25 columns, and (eventually) thousands to tens of thousands of rows. Easy data entry and maintenance are more important than presentation of the data to non-privileged users. It's a niche site, so performance is not a concern. We'll have no trouble finding somewhere to host it.
So: what's a good platform for this? Intuitively I feel that there ought to be some platform that allows this to be done with no code written - some web version of MS Access. Obviously I'm happy to code business rules and any special logic that distinguishes this from every other database app.
I've looked at Drupal (with Views) and it looks possible, but with quite a bit of effort. I'll look at Alfresco next. A CMS-style platform helps because you can nicely integrate static content, and you get nice styling, plugins, etc.
Really good data entry (tracking changes, logging, ability to roll back, mass imports...) would be great. If authorised users could do arbitrary SQL queries (yes, I know...) that would be a big bonus. Image management support a small bonus.
Django is what you are looking for. In fact, you could probably set up what you ask without much coding at all, just configuration.
Once complete, authorised users can add 'rows' with a nice but simple GUI, or, of course, you can batch import via database commands.
I'm a Python newbie, and I've already created 2 Django-based sites. I have created more than a dozen Drupal-based sites, and Django is easier and produces significantly faster sites.
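To give a sense of how little code the Django route involves, here is a minimal sketch; the model and its fields are placeholders rather than your actual schema, and registering the model with Django's bundled admin app is what provides the data-entry GUI:

```python
# models.py -- a stand-in model; your real tables become classes like this
from django.db import models

class Record(models.Model):
    name = models.CharField(max_length=200)
    category = models.CharField(max_length=100, blank=True)
    added_on = models.DateField(auto_now_add=True)
    notes = models.TextField(blank=True)

    def __unicode__(self):
        return self.name

# admin.py -- one line per model buys you the create/edit/list GUI
from django.contrib import admin
from models import Record

admin.site.register(Record)
```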
Your need sits somewhere between two stools: a bespoke application and a CMS-based one. I'd advocate the CMS approach, but only if you expect the need for content-structure customisation to grow over time, gradually removing the need for direct SQL queries.
I am biased, having worked with eZ Publish for many years now, but it natively satisfies the requirements you expressed:
Really good data entry (tracking changes, logging, ability to roll back, mass imports...)
[...] Image management support a small bonus.
You can get a feel for content editing here:
http://ez.no/Demos-Videos/eZ-Publish-Administration-Interface-Video-Tutorial
and you can download and test-drive eZ Publish Community Edition here: http://share.ez.no/latest
It is a PHP-based solution with a strong professional community (http://share.ez.no) and over 1,100 add-ons available at http://projects.ez.no. The underlying libraries mostly rely on Apache Zeta Components, a high-quality, robust set of PHP5 libraries.
Last note: the content model is abstracted, meaning you would not have to create a new table every time a new type of content needs to be stored. A simple content class definition in the administration interface and the rest is taken care of, including the editing interface for the new content type. That might remove the need for hand-written SQL queries.
Hope it helped,
Drupal can do most of what you need (I don't know of a module that will let you enter arbitrary SQL queries), but you will end up with some overhead of tables and modules you don't really need. It's up to you to decide if that's a problem or not. I don't think the overhead would hurt performance in your case.
The advantages of using Drupal would be the large community, the stability of the platform and the flexibility to add more functionality when needed. Also, the large user base ensures that most code has been tested rather well.
I highly recommend Drupal. It is very simple (internally, too, the codebase is small and clean), it has dozens of possibilities, and it has tremendous support. Once you start with Drupal you will never move to anything else.
Note that I'm not connected with the Drupal team; I've just created dozens of Drupal sites, many of them in a matter of minutes. My last one took me two hours; see it here: http://iPadDevZone.com
UPDATE #1:
It really depends on the complexity of your DB schema. In the best case you just use the CCK module (now part of core) and create your node type ('node' is the Drupal name for a piece of content). All you do is define your node type's fields (text, image, numbers, dates, custom, etc.) through the web admin. Then, when a user creates content of this node type, he or she can fill in all the fields, which are stored as separate database table columns. This is hidden from you, however - if you'd rather not know about it - it is just a web GUI. Finally, you choose how the node is presented: which properties are shown and where.
Watch the videos in the CCK resources section at the bottom of this page: http://drupal.org/project/cck
If you need to do some programming, it is also very easy to use so-called PHP code snippets, which are entered as part of your content (node) and executed when the page is displayed.
Drupal has node revisions built into the core. You can see all the versions and roll back if you wish.
You can set permissions at quite a granular level, so you can control what your users may or may not do.
I would take a look at Symphony. I haven't used it myself, but it seems really easy to use and to customize!
http://symphony-cms.com/
It seems to me an online database system would suit you better than a CMS.
So in addition to what's been posted above:
www.quickbase.com (by Intuit) - think around $150/mo
www.rollbase.com - check on price, full featured
www.rhythmdata.com - easy to set up, but I don't think it has the advanced features you're looking for.
Good luck!
B
I appreciate these answers, but most of them are really platforms that are much better at something else (e.g., Drupal really is a CMS and has some support for custom fields, but it's not at all easy). Since this is a brand new site from scratch, I don't think it makes sense to start with something that treats custom database fields as an afterthought.
The closest I've found is Zoho Creator. It really is like "MS Access for Web 2.0" - and even supports importing from Access. The pricing could get expensive though. It feels like it might eventually be quite constraining. I'm still evaluating.
Are there any other products like Zoho Creator?

Which web solution should I use for my project?

I'm going to create a fairly large (from my point of view, anyway) web project with a friend. We will create a site with roads and other road-related info.
Our estimate is that we will have around 100k items in our database. Each item will contain some information such as location, name, etc. (about 30 attributes each). We are counting on a few hundred thousand unique visitors per month.
The 100k items and their locations (which will be searchable) will be the main part of the site, but we will also have some articles, comments, news and, later on, some more social functions (accounts, forums, picture uploads, etc.).
We were going to use Google App Engine to develop our project since it is really scalable and free (at least for a while). But I'm actually starting to doubt that App Engine is right for us. It seems to be intended for web apps rather than sites like ours.
Which system (language/framework, etc.) would you recommend we use? It doesn't really matter whether we already know the language (we like learning new stuff), but it would be good if it's something future-proof.
I think that GAE can do the job. Google claims that Google App Engine is able to handle 5 million visitors for free and you will have to start paying only if you exceed their free quota.
It's also pretty easy to get started with. If you don't have experience administering websites and you choose a regular hosting service, you will have to worry about several things you can't even imagine now.
My only concern would be the kind of data and queries you will have to run, since GAE does not have a relational database. That said, there is an open-source project for GAE called GeoModel that gives it the ability to do complex geospatial queries, such as proximity fetches. Have a look at their tutorial and the demo app.
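For a feel of what GeoModel gives you, here is a rough sketch modelled on its demo code; the model is hypothetical, and the exact import path and method names should be checked against the project's tutorial:

```python
from google.appengine.ext import db
from geo import geomodel   # the GeoModel library mentioned above

class Place(geomodel.GeoModel):   # hypothetical model for illustration
    name = db.StringProperty()

# Storing a point: set the location, refresh its geocells, then save.
place = Place(name='Central Station', location=db.GeoPt(59.330, 18.058))
place.update_location()
place.put()

# Proximity fetch: the 20 places nearest a point, within roughly 10 km.
nearby = Place.proximity_fetch(
    Place.all(),
    db.GeoPt(59.334, 18.063),
    max_results=20,
    max_distance=10000)
```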
As for your impression that GAE is intended only for small web apps: there are a couple of CMSes that run on it.
Good luck!
If one of your concerns is scalability, and you don't want to depend on expensive or commercial tools, I would recommend that you take a look at this tech stack:
Erlang - A programming language designed for concurrency and distribution.
Nitrogen - An Erlang web framework with a lot of cool stuff, like transparent AJAX.
Scalable NoSQL databases, such as CouchDB or Riak - they save you the hassle of SQL code and are more scalable than plain MySQL. Both have a native Erlang API.
To be honest, I don't know if this tool set is your cup of tea; these are not mainstream solutions. I just suggest them to everyone who asks about size-sensitive web applications.
Any serious web framework will provide you with what you need. The real issues (for example scalability) might be tackled in different ways depending on what you use, but you won't be limited if you choose a well-known one. The choice of database system (SQL vs. NoSQL) might be more important for that, even if either of those will do fine too.
It's all about knowing how to use, and enjoying using, the tool(s) you've chosen.
In either case, name-dropping some suggestions:
Rails (Ruby)
Django (Python)
Nitrogen (Erlang)
ASP.NET MVC (C#)
And please note: if you really want to learn everything from the ground up, you'd be fine with any of these (or one of the other gazillion out there). But if you want to perform at your best, choose one built on a language you know well or on techniques/tools you have experience with. Think twice about how you weigh "this is fun and we learn a lot" against "we want to be productive and do a really good job".

How to best deploy a single Google App Engine application to multiple region-specific subdomains?

I am trying to figure out the best way to deploy a single Google App Engine application across multiple regions.
The same code is to be used, but the stored data is specific to each region. Motivating examples are hyperlocal review sites, like yelp.com or urbanspoon, where restaurants and other businesses to review are specific to a region (e.g. boston.app.com, seattle.app.com).
A couple of options include:
Option 1: Create multiple GAE applications and duplicate the code across them.
Option 2: Create a single GAE application and store all data for all regions in the same Datastore, with a region identifier field on each model delimiting the relevant region.
Some of the trade-offs:
Option 2 seems like it will be increasingly inefficient (space: replicating a region identifier for each record of every model; time: filtering/indexing on the identifier for every query).
Option 1 requires an app ID for every region, while GAE only allows 10 apps per account. Moreover, deploying the code across every region, as well as Datastore migration, seems like it could be a pain to manage.
In the ideal world, I would have a single application instance. From that instance, I could route between subdomains (like here), as well as have a separate Datastore for each subdomain. But I believe GAE only allows a single datastore per application.
Does anyone have ideas on the best way to solve this problem? Or options that I am not considering?
Thanks for your time!
I would recommend your approach #2. Storage space is cheap (and region codes are short), and datastore performance does not degrade with size, unlike most databases. Using a single app also makes for easier management and upgrades, and avoids any issues with the TOS (which prohibit sharding your app to avoid billing charges).
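As a minimal sketch of what approach #2 looks like in practice (using the Python datastore API of the time; the model and field names are placeholders):

```python
from google.appengine.ext import db

class Business(db.Model):        # hypothetical model for illustration
    name = db.StringProperty(required=True)
    region = db.StringProperty(required=True)   # e.g. 'boston', 'seattle'

def businesses_for(region):
    # The region code costs a few bytes per entity, and one index on
    # (region, name) serves the same query for every subdomain.
    return Business.all().filter('region =', region).order('name').fetch(100)
```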
If you use source code revision control, then it is not too bad to push identical code into multiple apps. You could set a policy whereby only full-fledged tags are allowed to be pushed up to GAE. Another option is to make your application version the same as the revision number.
With App Engine, I (and I believe most others) always migrate data from within my model code. You can't easily do bulk migrations in GAE and the usual solution is to migrate data as you come across it in code. In this way, you can keep your models pretty much identical across applications.
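For illustration, a lazy migrate-as-you-read sketch along those lines; the model, field and version numbers are hypothetical:

```python
from google.appengine.ext import db

class Review(db.Model):                          # hypothetical model
    text = db.TextProperty()
    region = db.StringProperty()                 # field added after launch
    schema_version = db.IntegerProperty(default=1)

def load_review(key):
    """Fetch a review and quietly upgrade old entities as they are read."""
    review = db.get(key)
    if review.schema_version < 2:
        review.region = 'unknown'                # backfill the new field
        review.schema_version = 2
        review.put()
    return review
```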
Having said that, I would probably still go with a unified application. It's more future-proof. What if users want to join their L.A. identity and their New York identity? Or what if an advertiser offers you a sweet deal for you to run some marketing reports on your own data?
Finally, a few bytes of data don't matter much on App Engine. As your site grows, you will very quickly discover that you are always bumping into ceilings. GAE limits are extremely small compared to a traditional web server, so you will have to work within those limits anyway. For example, you can only fetch 1,000 records at a time, so your architecture will already need to support a piecemeal paging solution. So don't worry too much about an extra field or two in your records.

Looking for an example of when screen scraping might be worthwhile

Screen scraping seems like a useful tool - you can go onto someone else's site and steal their data - how wonderful!
But I'm having a hard time seeing how useful this could be.
Most application data is pretty specific to that application even on the web. For example, let's say I scrape all of the questions and answers off of StackOverflow or all of the results off of Google (assuming this were possible) - I'm left with data that is not very useful unless I either have a competing question and answer site (in which case the stolen data will be immediately obvious) or a competing search engine (in which case, unless I have an algorithm of my own, my data is going to be stale pretty quickly).
So my question is, under what circumstances could the data from one app be useful to some external app? I'm looking for a practical example to illustrate the point.
It's useful when a site publicly provides data that is (still) not available as an XML service. I had a client who used scraping to pull flight tracking data into one of his company's intranet applications.
The technique is also used for research. I had a client who wanted to compare the contents of several online dictionaries by part of speech, and all of these sites had to be scraped.
It is not a technique for "stealing" data. All ordinary usage restrictions apply. Many sites implement CAPTCHA mechanisms to prevent scraping, and it is inappropriate to work around these.
A good example is StackOverflow - no need to scrape data as they've released it under a CC license. Already the community is crunching statistics and creating interesting graphs.
There's a whole bunch of popular mashup examples on ProgrammableWeb. You can even meet up with fellow mashupers (O_o) at events like BarCamps and Hack Days (take a sleeping bag). Have a look at the wealth of information available from Yahoo APIs (particularly Pipes) and see what developers are doing with it.
Don't steal and republish, build something even better with the data - new ways of understanding, searching or exploring it. Always cite your data sources and thank those who helped you. Use it to learn a new language or understand data or help promote the semantic web. Remember it's for fun not profit!
Hope that helps :)
If the site has data that would benefit from being accessible through an API (and it would be free and legal to do so), but they just haven't implemented one yet, screen scraping is a way of essentially creating that functionality for yourself.
Practical example -- screen scraping would allow you to create some sort of mashup that combines information from the entire SO family of sites, since there's currently no API.
Well, to collect data from a mainframe. That's one reason some people use screen scraping. Mainframes are still in use in the financial world, and often they're running software that was written in the previous century. The people who wrote it may already be retired, and since this software is critical to these organizations, they really hate it when new code needs to be added. So screen scraping offers an easy interface to the mainframe: collect information from it and send it onwards to any process that needs it.
Rewrite the mainframe application, you say? Well, software on mainframes can be very old. I've seen mainframe software over 30 years old, written in COBOL. Often those applications work just fine, and companies don't want to risk rewriting parts because it might break code that has been working for over 30 years! Don't fix things that aren't broken, please. Of course, additional code could be written, but it takes a long time for mainframe code to reach a production environment, and experienced mainframe developers are hard to find.
I myself had to use screen scraping in a software project too. This was a scheduling application which had to capture the console output of every child process it started. It's the simplest form of screen scraping, actually, and many people don't even realize that if you redirect the output of one application to the input of another, it's still a kind of screen scraping. :)
Basically, screen scraping allows you to connect one (web) application with another one. It's often a quick solution, used when other solutions would cost too much time. Everyone hates it, but the amount of time it saves still makes it very efficient.
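As a trivial illustration of the console-capture case described above, a Python sketch; 'legacy_report_job' is a stand-in for whatever program the scheduler actually launches:

```python
import subprocess

# Capture everything the child process writes to its console -- the
# simplest form of screen scraping mentioned above.
output = subprocess.check_output(['legacy_report_job', '--daily'],
                                 universal_newlines=True)

errors = [line for line in output.splitlines() if line.startswith('ERROR')]
print('%d error lines captured' % len(errors))
```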
Let's say you wanted to get scores from a popular sports site that did not offer the information available with an XML feed or API.
For one project we found a (cheap) commercial vendor that offered translation services for a specific file format. The vendor didn't offer an API (it was, after all, a cheap vendor) and instead had a web form to upload and download from.
With hundreds of files a day, the only way to do this was to use WWW::Mechanize in Perl, screen scrape our way through the login and upload boxes, submit the file, and save the returned file. It's ugly and definitely fragile (if the vendor changes the site even slightly it could break the app), but it works. It's been working now for over a year.
One example from my experience.
I needed a list of major cities throughout the world with their latitude and longitude for an iPhone app I was building. The app would use that data along with the geolocation feature on the iPhone to show which major city each user of the app was closest to (so as not to show exact location), and plot them on a 3D globe of the earth.
I couldn't easily find an appropriate list in XML/Excel/CSV format anywhere, but I did find a Wikipedia page with (roughly) the info I needed. So I wrote a quick script to scrape that page and load the data into a database.
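A sketch of the kind of throwaway script this involves - the libraries (requests, BeautifulSoup), the URL and the column layout here are all stand-ins, not the ones actually used:

```python
import csv
import requests
from bs4 import BeautifulSoup

URL = 'https://en.wikipedia.org/wiki/...'   # stand-in for the page in question

soup = BeautifulSoup(requests.get(URL).text, 'html.parser')

rows = []
for tr in soup.find_all('tr'):
    cells = [td.get_text(strip=True) for td in tr.find_all('td')]
    if len(cells) >= 3:                      # assume: city, latitude, longitude
        rows.append(cells[:3])

with open('cities.csv', 'w') as f:
    csv.writer(f).writerows(rows)
```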
Any time you need a computer to read the data on a website. Screen scraping is useful in exactly the same instances that any website API is useful. Some websites, however, don't have the resources to create an API themselves; screen scraping is the developer's way around that.
For instance, in the earlier days of Stack Overflow, someone built a tool to track changes to your reputation over time, before Stack Overflow itself provided that feature. The only way to do that, since Stack Overflow has no API, was to screen scrape.
The obvious case is when a webservice doesn't offer reverse search. You can implement that reverse search over the same data set, but it requires scraping the entire dataset.
This may be fair use if the reverse search also requires significant pre-processing, e.g. because you need to support partial matching. The data source may not have the technical skills or computing resources to provide the reverse search option.
I use screen scraping daily. I run some e-commerce sites and have screen-scraping scripts running daily to gather product lists automatically from my suppliers' wholesale sites. This gives me up-to-date information on all the products available to me from several suppliers and lets me flag uneconomical margins caused by price changes.

What's the best way to make a mobile friendly site?

Speaking entirely in technology-free terms, what is the best way to make a mobile friendly site? That is, I want to make a site that will work on a regular computer but also have mobile versions of the pages. Should I rewrite each page? The pages will probably have different functionality, so should I rewrite the backend code? Should it be an effectively different site with the same database?
On my site, I detect the user agent and, for known mobile browsers, serve a different stylesheet, with some larger or less necessary items left off some pages. The backend doesn't really change.
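In rough Python terms the check can be as small as this sketch; the token list and stylesheet paths are illustrative, not the ones actually used:

```python
MOBILE_TOKENS = ('iphone', 'ipod', 'android', 'blackberry',
                 'opera mini', 'windows ce', 'symbian')

def stylesheet_for(user_agent):
    """Pick a stylesheet based on the User-Agent header."""
    ua = (user_agent or '').lower()
    if any(token in ua for token in MOBILE_TOKENS):
        return '/css/mobile.css'
    return '/css/screen.css'
```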
I added a mobile presentation layer to an operational site about a year ago. Based on the architecture of the site (hopefully this isn't too technology dependent for you) I added a new set of JSPs to accommodate mobile browsers (sidenote: see http://wurfl.sourceforge.net/ for a great way to build mobile pages independent of browser type). Additionally some of the back-end functionality was changed due to the limited functionality of most mobile browsers. So, in short, the integration wasn't as painful as one would expect.
Good luck!
This is a pretty broad question, but here goes:
If the site is primarily about the content, meaning it's not so much a service you use as it's a publication you read, then I'd try to avoid publishing two sites wherever possible. Concentrate on simple presentation using mature technologies that mobile browsers can handle fairly well.
If it's essentially a software application delivered via the network, then things get trickier, because you're going to want to consider the UI of the mobile device, and how it differs from the desktop.
This should go without saying, but either way, if you have many mobile users, you should keep that in mind when you author content for the site. Formats, length, voice, etc.
In addition to the WURFL / WALL capabilities system that todd mentioned, there are JavaServer Faces libraries available that use alternate WML render kits for mobile phones.
One way I have done it in the past was to make sure my data was well abstracted in the data tier and then use separate middle-tier models to pull what was appropriate. In my case the application was a weather application, and the display capabilities of the target devices were really limited, so we opted to show users only the essentials on the mobile devices while the website was full featured. That was probably 10 years ago, when WAP was big. But these days, with devices getting bigger screens and better bandwidth, you may want to consume and display the exact same data with a different view model.
I never really know what type of application will need to consume the data in the future. We do a lot of apps across platforms, but the domain model rarely changes. So I end up using the same middle-tier objects where I can and pulling that data into different clients. A good example of this is a recent project where we had a rich internet application (widget), a full website, and a web service all consuming the same data. Data abstraction in the middle tier really shines in this environment.
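A toy sketch of that idea - one shared domain object with a full and a mobile view model (all names here are hypothetical):

```python
class Forecast(object):
    """Shared domain object used by every client."""
    def __init__(self, city, temperature_c, humidity, wind_kph, outlook):
        self.city = city
        self.temperature_c = temperature_c
        self.humidity = humidity
        self.wind_kph = wind_kph
        self.outlook = outlook

def full_view(forecast):
    # The desktop site shows everything.
    return vars(forecast)

def mobile_view(forecast):
    # The mobile site shows only the essentials.
    return {'city': forecast.city,
            'temperature_c': forecast.temperature_c,
            'outlook': forecast.outlook}
```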
On a very high level of abstraction, there are two main caveats with mobile devices: (1) their screens are small, and (2) their network connections are intermittent. This basically means that you need to present the content so that it looks fine even on a small (variable-size) screen, and preferably make it cacheable too, so that your users can browse the content while offline. Then there's also the problem of low bandwidth and high latency, but those are slightly less important nowadays.
This is a very thorough overview of how to make a site mobile, though I hope it's fair to say that there will always be different requirements for anyone seeking to go mobile. If you have a blog, then you could just as easily make it mobile friendly using Mippin Mobilizer; it's free, provides branding customisation tools, and with a big audience already browsing a wide mix of mobilized content, there are opportunities to generate advertising revenue around your blog.
This is because the Mippin-mobilized blog then becomes part of a much wider community of content, people, news, blogs and listings, all connecting around content, with much more at the mobile site:
http://mippin.com (on a mobile browser.)
Take a look at the Mobilizing tool because it shows off what the site can do in a second:
www.mippin.com/mobilizer
Only if you have a blog of course...
