Patterns for replicating user data from one application to another - database

I have a website that I've integrated with a popular forum software (phpBB).
I have it set up so users who log in to the main site are automatically logged in to the forum software as well. I do this by authenticating through the forum's API at the same time.
When someone registers for the site, an entry goes into the main site database and an entry goes into the forum user database (using the forum API).
The primary ID of the forum user table is stored in a column in the main site user DB. This is saved at the time of registration: the registration process first creates a forum user, then passes the ID back into the query that creates the user in the main site.
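For illustration, a minimal PHP sketch of that linking step. The forum_api_* functions and column names here are hypothetical stand-ins, not the actual phpBB API; the point is that the returned forum ID is verified before it is stored:

<?php
// Hypothetical registration flow; forum_api_* are illustrative stand-ins.
function register_user(PDO $db, string $username, string $email, string $hash): void
{
    // 1. Create the forum user first; the API returns the new forum user ID.
    $forumUserId = forum_api_create_user($username, $email, $hash);

    // 2. Defensive check: confirm the returned ID really belongs to this
    //    username before persisting the link, catching cross-wired IDs early.
    if (forum_api_get_username($forumUserId) !== $username) {
        throw new RuntimeException("Forum ID $forumUserId does not match $username");
    }

    // 3. Store the verified forum ID alongside the main-site user row.
    $db->prepare('INSERT INTO users (username, email, password_hash, forum_user_id)
                  VALUES (?, ?, ?, ?)')
       ->execute([$username, $email, $hash, $forumUserId]);
}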
When a user logs in, if they authenticate with the main site, that ID is pulled and passed into the forum login API to log in the correct person.
However, a weird thing seems to happen randomly: one in every 30 or 40 people who register ends up with a forum user ID that is not their own in the main site user table. I know how to look for these problems and fix them on a case-by-case basis, and I have scripts in place to do so, but that seems more like a band-aid than a fix.
Is this a common problem when linking data like this, or does it seem like something specific to the software? Because of the randomness of this issue, it's been hard to debug.

I would suspect Session Management. Are you intentionally or unintentionally reusing session ids?
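If session reuse is the culprit, one standard safeguard is to regenerate the session ID whenever a user registers or logs in, so a stale session can never be attributed to the next visitor. A minimal sketch (the session keys are illustrative):

<?php
session_start();

// Call after a successful registration or login: discard the old session ID
// and issue a fresh one; passing true deletes the old session data as well.
function on_authenticated(int $mainSiteUserId, int $forumUserId): void
{
    session_regenerate_id(true);

    // Rebind identity to the new session from scratch.
    $_SESSION = [
        'user_id'       => $mainSiteUserId,
        'forum_user_id' => $forumUserId,
    ];
}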

I've done something similar with vBulletin, by directly using the forum's MySQL database to authenticate the main site and other sites (they're all on the same machine)...
In your case, I would keep the site-specific fields that are not in the phpBB database in the site's own DB, and link them to phpBB by user_id. It could be one form on the main site that inserts into the two databases (some fields into the main site DB, the others, including any extra-privilege fields, into the phpBB DB). I'd use my own non-standard CAPTCHA, such as a distorted image asking "what is x + y" with x and y as random numbers (and + possibly replaced by other operations), an image saying "type the word ORANGE", or simply "type your username again".
I would disable the default phpBB registration... there are so many bots that know how to use it...
This guarantees you have one source for the info, and you fill in all of it at once.
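Since both databases live on the same MySQL server, one connection can even wrap both inserts in a single transaction. A rough sketch, assuming InnoDB; the database, table, and column names are illustrative, not the real phpBB schema:

<?php
// One registration handler writing to both databases on the same server.
$username = 'alice';
$email    = 'alice@example.com';
$fullName = 'Alice Example';

$pdo = new PDO('mysql:host=localhost;charset=utf8mb4', 'appuser', 'secret', [
    PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
]);

$pdo->beginTransaction();
try {
    // Forum-side row first, to obtain the shared user id.
    $pdo->prepare('INSERT INTO forum_db.users (username, user_email) VALUES (?, ?)')
        ->execute([$username, $email]);
    $userId = (int) $pdo->lastInsertId();

    // Site-specific fields, keyed by the same user id.
    $pdo->prepare('INSERT INTO site_db.profiles (user_id, full_name) VALUES (?, ?)')
        ->execute([$userId, $fullName]);

    $pdo->commit(); // Both rows land, or neither does.
} catch (Throwable $e) {
    $pdo->rollBack();
    throw $e;
}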

Related

What happens if I use the same database to connect two websites?

I have an agriculture-related website with info articles.
The script is very limited and does not allow me to add a classifieds section to my website, so I need to create another one.
For the new website (created on a subdomain, but with the same script), if I use the same database, will user accounts be kept?
In fact, that's what interests me: my users from the info site should automatically keep their accounts on the classifieds site, without needing another account.
I just want to keep user accounts and nothing else.
The database keeps all of the data in it regardless of how many sites it's connected to; however, all of those sites can now see and edit it.
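In practice, "connecting" the two sites just means pointing both at the same credentials. A minimal sketch (host, database name, and credentials are placeholders):

<?php
// shared_db.php - included by BOTH the info site and the classifieds site.
// Because they open the same database, they share the same users table,
// so an account created on one site works on the other.
const DB_DSN  = 'mysql:host=localhost;dbname=agri_site;charset=utf8mb4';
const DB_USER = 'agri_app';   // placeholder credentials
const DB_PASS = 'change-me';

function db(): PDO
{
    static $pdo = null;
    if ($pdo === null) {
        $pdo = new PDO(DB_DSN, DB_USER, DB_PASS, [
            PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
        ]);
    }
    return $pdo;
}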

DNN 9: restrict a logged-in user to a single session at any one time

I want to be able to make it so a registered user can only be logged into the DNN site from one device/browser at any one time.
I understand that the DNN core doesn't support sessions, but it does have a UsersOnline table that is checked by the scheduler; however, I have been unable to find anything available that uses this method.
The main purpose is to stop a paid user from sharing their login details with multiple people and thereby diluting the potential revenue to the site. I would think this is not a unique use case, and someone must have dealt with it previously.
Open to any and all ideas including commercial modules.
I suppose you could create a custom login module and reject logins from a user who appears as active in the UsersOnline table.
I haven't looked around to see what methods are available, but the old UsersOnline module should provide some hints.
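The logic itself is platform-agnostic, so here is the shape of that check sketched in PHP against a hypothetical users_online table (DNN itself is ASP.NET and its real module API will differ; this only illustrates the rule a custom login module would enforce):

<?php
// Illustration of a "one active session per user" gate; table and column
// names are hypothetical, and DNN's actual API is ASP.NET, not PHP.
function try_login(PDO $db, int $userId, int $timeoutMinutes = 20): bool
{
    // Treat a row newer than the timeout as "this user is already online".
    $stmt = $db->prepare(
        'SELECT COUNT(*) FROM users_online
         WHERE user_id = ? AND last_seen > (NOW() - INTERVAL ? MINUTE)'
    );
    $stmt->execute([$userId, $timeoutMinutes]);

    if ((int) $stmt->fetchColumn() > 0) {
        return false; // Reject: an active session already exists elsewhere.
    }

    // Record (or refresh) presence; assumes user_id is the primary key.
    $db->prepare('REPLACE INTO users_online (user_id, last_seen) VALUES (?, NOW())')
       ->execute([$userId]);

    return true;
}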

Managing user accounts for different sites with one database

We now have one site running, but we will need to build a branded site for our client soon. The client site will have exactly the same data set as our current site except for the user data. The client site must have totally separate user info, allowing only the client to use the site.
I don't see the need to set up a new database or create a new user table for the client. My tentative solution is to add a "Company" column to the user table so that I can tell which site each user row belongs to.
I do not know if this approach will work or if it is best practice. Could anyone with experience of something like this shed some light on the question?
Thanks,
Nigong
P.S. I use LAMP with AWS.
Using an extra column to store a company/entity ID is a common approach for multitenant systems. In general, you will want to abstract the part that verifies you can only retrieve data you're allowed to see into a single piece that all queries go through, like your ORM. This will prevent people new to the project from exposing or using data that shouldn't be exposed or used.
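As a minimal sketch of that "single piece all queries go through" idea (class, table, and column names are illustrative):

<?php
// Tenant-scoping gateway: every read goes through here, so forgetting the
// company filter becomes impossible by construction.
final class TenantDb
{
    public function __construct(
        private PDO $pdo,
        private int $companyId  // set once from the authenticated session
    ) {}

    // Fetch rows from $table, always constrained to the current tenant.
    // $table and $where must come from code, never from user input.
    public function select(string $table, string $where = '1=1', array $params = []): array
    {
        $stmt = $this->pdo->prepare(
            "SELECT * FROM {$table} WHERE company_id = ? AND {$where}"
        );
        $stmt->execute([$this->companyId, ...$params]);
        return $stmt->fetchAll(PDO::FETCH_ASSOC);
    }
}

// Usage: a request on the client site only ever sees that company's rows.
// $db    = new TenantDb(new PDO(/* dsn */), 2);
// $users = $db->select('users', 'email = ?', ['someone@example.com']);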

Evernote users in the application database

What's the best practice or the common way of keeping (or not keeping) Evernote users in your application's database?
Should I create my own membership system and create a connection to Evernote accounts?
Should I store Evernote user data (or only part of it) in my own app and let the user log in only with Evernote?
Summary: you must protect their data but how you protect it is up to you. Use the integer edam_userId to identify data.
I think the API License agreement covers protection in the terms:
you agree that when using the API you will not, directly or indirectly, take or enable another to take any of the following actions:...
1.8.4 circumvent or modify any Keys or other security mechanism employed by Evernote or the API;
If you cache people's data and your server-based app lacks security to prevent people looking at others' data, then I think you're pretty clearly violating that clause. I think it's quite elegantly written!
Couple that with the responsibility clause 1.2
You are fully responsible for all activities that occur using your Keys, regardless of whether such activities are undertaken by you or a third party.
So if you don't protect someone's cached data and another user is able to get at it, you're explicitly liable.
Having cleared up the question of your obligations to (as you'd expect) protect people's data, the question is how do you store it?
Clause 4.3 covers identifiers pretty directly, although it's a bit out of date now that we are all forced to use OAuth - there are no passwords ever entered into anything other than a web view. However, mobile or desktop client apps must provide a mechanism for the user to log out, which must completely remove the username and password from your application and its persistent storage.
For a web app, you can't even save the username: If your Application runs as an Internet service on a multi-user server, you must not ask for, view, store or cache the sign-in name or password of Evernote user accounts.
The good news is that you can rely on the edam_userId value, which comes back to you in the OAuth token credentials response, as discussed here.
When you look at the Data Model, you can see the unique ID under the User, and going into the User struct, see the reassuring definition: "The unique numeric identifier for the account, which will not change for the lifetime of the account."
Thinking about the consequences: as you can't get the user ID until you have logged into the service, if you want to provide a local login for people you will have to link your local credentials to the user ID. That may irk some people if they have to enter a username twice, but it can't be helped.
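A sketch of that linking step after the OAuth dance completes. The column name is illustrative; the parsing assumes the token credentials arrive as a form-encoded string containing edam_userId, as described above:

<?php
// Link the Evernote account to a local user once OAuth has finished.
function link_evernote_account(PDO $db, int $localUserId, string $tokenResponseBody): void
{
    // The token response is form-encoded, e.g. "oauth_token=...&edam_userId=123...".
    parse_str($tokenResponseBody, $fields);

    if (!isset($fields['edam_userId'])) {
        throw new RuntimeException('No edam_userId in token response');
    }

    // Store only the stable numeric ID - never an Evernote username/password.
    $db->prepare('UPDATE users SET edam_user_id = ? WHERE id = ?')
       ->execute([(int) $fields['edam_userId'], $localUserId]);
}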
You can allow users to log in via OAuth. Here's a guide on how that process works.
But you'll probably also want to store a minimal amount of user data, at least a unique identifier, in your database so you can do things like create relationships between the user and their notebooks and tags. Refer to the Evernote data model for those relationships. If you're using Rails, this will also help you take advantage of Rails conventions.

How best to screen scrape a password protected site on behalf of a 3rd party?

I want to write a program that analyzes your fantasy baseball team and notifies you of recommended actions, possibly multiple times per day. The problem is, you aren't playing fantasy baseball on my site, you're playing on yahoo, or cbs, or espn, etc.
On the majority of these sites, fantasy teams and leagues are not public, so you must be logged in and a member of the league to see the teams in the league.
All that I need is the plain html for the team page on each of those sites to be sent to my server, where I can then parse and analyze the file and send user notifications.
The problem is that I need username/password combinations to easily get this data to my server when I need it, and I think there will be a lot of people who wouldn't want to entrust their yahoo/espn/cbs password to me.
I have come up with several possible ways to solve this problem:
The most obvious way is to ask for their credentials for the site on which their team is hosted. Then I could just programmatically log in and request the data I need. I'm guessing a number of people would be comfortable giving me their credentials, and a number of them not so much.
Write a desktop client, which the user then downloads. The client would require their credentials, but it could then basically do exactly the same thing that the server based version would do, log in, request the page, and send the page back to my server. The difference being that their password would never need to leave their desktop. Their computer would need to be on, and this program running for this method to work.
Write browser add-ons that navigate to the page I need, use the cookie saved from a previous login to log in to the site, and send the page back to my server. This doesn't require my software to ever ask for their password, but if the cookie expires I am hosed, and I don't know much about browser add-ons besides.
I'm sure there are other options, but these are what I've come up with so far.
I have two questions:
1. What are the other possibilities for this type of task?
2. Am I over-estimating people's reluctance to give me their yahoo (for example) password? Is option (1) above the obvious choice?
It was suggested in the comments that I try Yahoo Pipes, and that looked like a promising suggestion, so I explored it a bit. Having now looked at it, I don't think that is an option. So it looks like I'll be going with option 1.
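For the record, a rough sketch of what option 1 looks like in practice; the URLs and form field names are hypothetical, every host site differs, and (as the answer below notes) the terms of service may forbid this:

<?php
// Option 1 sketch: programmatic login with stored credentials, then fetch
// the team page. Field names and URLs are made up for illustration.
function fetch_team_page(string $user, string $pass): string
{
    $cookieJar = tempnam(sys_get_temp_dir(), 'cookies');
    $ch = curl_init();

    // Step 1: POST the login form; the session cookie lands in $cookieJar.
    curl_setopt_array($ch, [
        CURLOPT_URL            => 'https://fantasy.example.com/login',
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query(['user' => $user, 'pass' => $pass]),
        CURLOPT_COOKIEJAR      => $cookieJar,
        CURLOPT_COOKIEFILE     => $cookieJar,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_RETURNTRANSFER => true,
    ]);
    curl_exec($ch);

    // Step 2: GET the now-authenticated team page HTML.
    curl_setopt($ch, CURLOPT_HTTPGET, true);
    curl_setopt($ch, CURLOPT_URL, 'https://fantasy.example.com/my-team');
    $html = curl_exec($ch);

    curl_close($ch);
    return is_string($html) ? $html : '';
}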
This is a problem I grappled with a couple of years ago when I wanted to do the same thing. Our site is http://benchcoach.com, and the options we were considering were the following:
Originally, we considered getting the user's credentials and logging in. We would then log in and scrape their league and team info. The problem there is that, after reading several of the various terms of service, this would definitely be violating them. On top of this, Yahoo! was definitely one of the sites we were considering, and their users have email (where we could get access to sensitive data) and Yahoo! Wallet. In addition, it would be pretty trivial for Yahoo/ESPN/CBS to block our programmatic logins by IP address.
The solution we settled on (we're not 100% happy with it, but it does seem to work) was asking our users to install a bookmarklet (like Delicious, Digg, or Reddit use) which would POST the current HTML page to our servers, where we could parse the data and load our database. If they were still logged into their Yahoo/ESPN/CBS account, we would take them directly to the pages; otherwise, those sites would prompt for authentication. Clicking the bookmarklet once more would POST the page to our servers.
The pros of this approach were that we never collected anyone's credentials, so any security concern was alleviated. Secondly, it made it impossible for Yahoo/ESPN/CBS to block access to our service, since we never connected directly to their servers; rather, the user's browser posted the contents of the page to our server.
The problem with this is that it takes two clicks to post a page to our site. For head-to-head leagues, we needed 3-4 pages, so it would take our users 6-8 clicks to sync their league to our servers. We're still looking at options for this.
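The server side of that bookmarklet flow is small; a minimal sketch of the receiving endpoint (parameter names are illustrative, and real use needs authentication on this endpoint):

<?php
// receive_page.php - the endpoint the bookmarklet POSTs the page HTML to.
$html      = $_POST['html'] ?? '';
$sourceUrl = $_POST['url']  ?? '';

if ($html === '') {
    http_response_code(400);
    exit('missing page body');
}

// Parse the posted markup server-side; libxml tolerates messy real-world HTML.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);

// From here, walk the DOM for roster rows and load them into the database.
$rows = $doc->getElementsByTagName('tr');
printf("received %d table rows from %s\n", $rows->length, $sourceUrl);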
One important note: I ran into the product manager of the Yahoo fantasy football site at a conference a year ago. We talked about how we were getting the Yahoo data, and he confirmed that collecting credentials would violate their TOS and that they might stop us. While I don't think they would have, it would have made it hard to invest time and energy developing this only to have them block our site and piss off users by closing their accounts.
A potentially more complicated approach could be built with (for example) Yahoo Pipes.
Hypothetically, you create a pipe which prompts the user for their credentials and provides them with a URL that contains their scraped data. They enter this URL on your site and never have to provide their credentials directly. Even better, for the security-conscious, it would be possible to examine what the pipe was actually doing before entering any information.
The downside would be increased complexity (as well as having to write and maintain the pipe). Having said that, you could provide a link directly to the published pipe from your site, to make things as easy as possible.
Option 1 is the obvious choice. People who trust your site will provide the details. There is no other way you can log in to another site while screen scraping.
