Import Wikipedia articles using wget or curl (on Windows) - batch-file

I have a folder of Wikipedia articles (XML format).
I want to import these files through the web interface (Special:Import). Currently I do it with iMacros, but this often hangs, needs a lot of resources (memory), and can only process one file at a time. So I am looking for a better solution.
I have figured out that I have to log in to get an edit token, which is needed to upload the file.
I have already read this, but got stuck.
To get this running I need two wget/curl command lines:
one to log in and get the edit token (post user and password to the form, get the edit token)
one to push the file to the form (post the edit token and content to the form)
I can build the loop to process more than one file on my own.

First of all, let's be clear: the web interface is not the right way to do this. MediaWiki installation requirements include shell access to the server, which would allow you to use importDump.php as needed for heavier imports.
Second, if you want to import a Wikipedia article from the web interface then you shouldn't be downloading the XML directly: Special:Import can do that for you. Set
$wgImportSources = array( 'wikipedia' );
or whatever (see manual), visit Special:Import, select Wikipedia from the dropdown, enter the title to import, confirm.
Third, if you want to use the command line, why not use the MediaWiki web API? Plenty of client libraries are available for it, and most of them handle tokens for you.
Finally, if you really insist on using wget/curl over index.php, you can get a token by visiting api.php?action=query&meta=tokens in your browser (check api.php for the exact instructions for your MediaWiki version) and then do something like
curl -d "&action=submit&...#filename.xml" .../index.php?title=Special:Import
(intentionally partial code so that you don't run it without knowing what you're doing).
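For the API route mentioned above, here is a minimal sketch using only the Python standard library. It is not the answerer's code. Assumptions: the API_URL value, the helper names, and the "wikipedia" interwiki prefix are placeholders; action=import with an uploaded xml field needs the import upload right; and the token and login parameters differ between MediaWiki versions, so check api.php on your own wiki before relying on any of this.

```python
import json
import uuid
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

API_URL = "https://wiki.example.org/w/api.php"  # assumption: replace with your wiki's api.php


def encode_multipart(fields, file_field, filename, file_bytes):
    """Build a multipart/form-data body. Returns (body_bytes, content_type)."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
         "Content-Type: application/xml\r\n\r\n").encode("utf-8")
        + file_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def api_post(opener, params):
    """POST url-encoded parameters to the API and decode the JSON reply."""
    data = urllib.parse.urlencode(params).encode("utf-8")
    with opener.open(API_URL, data) as resp:
        return json.load(resp)


def login(user, password):
    """Log in and return an opener that carries the session cookies."""
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar()))
    reply = api_post(opener, {"action": "query", "meta": "tokens",
                              "type": "login", "format": "json"})
    api_post(opener, {"action": "login", "lgname": user, "lgpassword": password,
                      "lgtoken": reply["query"]["tokens"]["logintoken"],
                      "format": "json"})
    return opener


def import_file(opener, path):
    """Fetch a CSRF (edit) token and upload one XML dump via action=import."""
    reply = api_post(opener, {"action": "query", "meta": "tokens", "format": "json"})
    fields = {"action": "import", "format": "json",
              "token": reply["query"]["tokens"]["csrftoken"],
              "interwikiprefix": "wikipedia"}  # assumption: adjust to your setup
    with open(path, "rb") as fh:
        body, content_type = encode_multipart(fields, "xml", path, fh.read())
    request = urllib.request.Request(API_URL, data=body,
                                     headers={"Content-Type": content_type})
    with opener.open(request) as resp:
        return json.load(resp)
```

Usage would be along the lines of opener = login("User", "password") followed by import_file(opener, name) for each XML file in the folder. Nothing here contacts the network until those functions are called.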

Related

Access questions programmatically? [duplicate]

I would like to (programmatically) convert a text file with questions to a Google form. I want to specify the questions, the question types, and their options. Example: a question of type scale should go from 1 to 7 and should have the label 'not important' for 1 and 'very important' for 7.
I was looking into the Google Spreadsheet API but did not see a solution.
(The Google form API at http://code.lancepollard.com/introducing-the-google-form-api is not an answer to this question)
Google has released an API for this: https://developers.google.com/apps-script/reference/forms/
This service allows scripts to create, access, and modify Google Forms.
Until Google satisfies this feature request (star the feature on Google's site if you want to vote for it), you could try a non-API approach.
iMacros allows you to record, modify and play back macros that control your web browser. My experiments with Google Drive showed that the basic version (without DirectScreen technology) doesn't record macros properly. I tried it with both the plugin for IE (basic and advanced click mode) and Chrome (the latter has limited iMacro support). FYI, I was able to get iMacros IE plug-in to create questions on mentimeter.com, but the macro recorder gets some input fields wrong (which requires hacking of the macro, double-checking the ATTR= of the TAG commands with the 'Inspect element' feature of Chrome, for example).
Assuming that you can get the TAG commands to produce clicks in the right places in Google Drive, the approach is that you basically write (ideally record) a macro, going through the steps you need to create the form as you would using a browser. Then the macro can be edited (you can use variables in iMacros, get the question/questiontype data from a CSV or user-input dialogs, etc.). Looping in iMacros is crude, however. There's no EOF for a CSV (you basically have to know how many lines are in the file and hard-code the loop in your macro).
There's a way to integrate iMacro calls with VB, etc., but I'm not sure if it's possible with the free versions. There's another angle where you generate code (Javascript) from a macro, and then modify it from there.
Of course, all of these things are more fragile than an API approach long-term. Google could change its presentation layer and it will break your macros.
It seems that Apps Script now has a REST API and SDKs for it. Through Apps Script you can generate Google Forms. This API was really hard to find by googling for it and I haven't tested it myself yet, but I am going to build something with it today (hopefully). So far everything looks good.
EDIT: Seems like the REST API I am using works very well for fully automated usage.
In March 2022, Google released a REST API for Google Forms. The API allows basic CRUD operations and also supports registering watches on a form, which notify you whenever the form is updated or a new response is received.
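As a rough illustration of the question's 1-to-7 scale example, the sketch below builds the JSON body that a Forms REST API batchUpdate call would carry. The function names and the sample question title are made up for illustration, the field names follow my reading of the Forms API reference and should be checked against it, and authentication plus the actual HTTP call are left out.

```python
import json


def scale_question_request(title, low, high, low_label, high_label, index):
    """Return one createItem request describing a linear-scale question."""
    return {
        "createItem": {
            "item": {
                "title": title,
                "questionItem": {
                    "question": {
                        "required": True,
                        "scaleQuestion": {
                            "low": low,
                            "high": high,
                            "lowLabel": low_label,
                            "highLabel": high_label,
                        },
                    }
                },
            },
            # Position of the new item within the form.
            "location": {"index": index},
        }
    }


def build_batch_update(questions):
    """Wrap a list of question descriptions into a single batchUpdate body."""
    return {
        "requests": [
            scale_question_request(index=i, **question)
            for i, question in enumerate(questions)
        ]
    }


# Hypothetical question, as it might be parsed from the text file.
payload = build_batch_update([
    {
        "title": "How important is the price?",
        "low": 1,
        "high": 7,
        "low_label": "not important",
        "high_label": "very important",
    }
])
print(json.dumps(payload, indent=2))
```

Building the payload separately from sending it keeps the text-file parsing testable without touching a real form.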
As of now (March 2016), the Google Forms APIs allow us to create forms and store them in Google Drive. However, they do not allow one to programmatically modify a form (such as modifying content, adding or deleting questions, pre-filling data, etc.). In other words, the form is static. To serve custom forms, external APIs are needed.

Use Apache to Rewrite URLs with Database Parameters as Nice URLs

For years our database driven websites have had URLs that look like:
https://www.example.com/product?id=30
But nowadays, and especially for SEO purposes, we want our URLs to be "nice" and look like:
https://www.example.com/30/myproduct
We use Zope 2.13.x running on Debian and using Apache 2.4 as the front-end webserver. I know that not too many people use Zope, but utilizing Apache's mod_rewrite we should be able to proxy the rewrite and have nice URLs that still pass the database arguments necessary in order to properly serve the pages to the end users.
There used to be a Zope Cookbook where I wrote a bunch of really detailed tutorials on Zope functionality but that no longer seems to exist and I wanted to share this with the SE community.
The awesome thing is that this is not specific to Zope, but will/should work with any rewrite of a parameter based URL into a nice URL and it's super easy once it's all working.
For complete transparency, I am going to answer my own question so that it's documented for everyone.
Using the rewrite engine in Apache, decide how you want your URLs to look to the end user in their web browser.
For example, if you are calling to a database and have a url that looks like
https://www.example.com/products?id=30&product_name=myproduct
but you want that URL to look like
https://www.example.com/products/30/myproduct
you would use a rewrite rule as follows:
RewriteRule ^/products/(.*)/(.*) /products?id=$1&product_name=$2 [L,P,NE,QSA]
To explain that further:
^/products/(.*)/(.*) says that whenever a URL under /products/ is requested, Apache should capture two variables from the path segments that follow, i.e. /(.*)/(.*)
If you only wanted one variable you would use ^/products/(.*)
Likewise, if you wanted three variables you would use ^/products/(.*)/(.*)/(.*)
From there we need to tell Apache how to rewrite that URL so that Zope (or whatever backend you may be using) still receives the correct URL parameters. That part is:
/products?id=$1&product_name=$2
Apache will now take whatever the first (.*) matched and substitute it for $1. It will take whatever the second (.*) matched and substitute it for $2, and so on.
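To see how the capture groups map to $1 and $2, the same pattern can be tried outside Apache. This is only a sketch with Python's re module, whose syntax is close to, but not identical with, the regex engine mod_rewrite uses; back-references are written \1 and \2 here instead of $1 and $2. The stricter pattern is my own addition, not part of the answer: it shows that (.*) is greedy and will swallow extra slashes.

```python
import re

# Pattern and target taken from the RewriteRule above.
GREEDY = re.compile(r"^/products/(.*)/(.*)")
TARGET = r"/products?id=\1&product_name=\2"

# Assumption (not in the original rule): each variable is one path segment.
STRICT = re.compile(r"^/products/([^/]+)/([^/]+)$")


def rewrite(pattern, path):
    """Return the rewritten path, or the path unchanged if the rule does not match."""
    return pattern.sub(TARGET, path, count=1)


print(rewrite(GREEDY, "/products/30/myproduct"))
# /products?id=30&product_name=myproduct

print(rewrite(GREEDY, "/products/30/my/product"))
# /products?id=30/my&product_name=product  (the first group grabbed "30/my")

print(rewrite(STRICT, "/products/30/my/product"))
# /products/30/my/product  (no match, so nothing is rewritten)
```

If product names can never contain a slash, a per-segment pattern such as ([^/]+) makes the rule fail cleanly on unexpected URLs instead of passing odd parameter values to the backend.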
The part in the brackets is extremely important:
L = Last. This makes Apache stop processing the rewrite ruleset if the rule matches. This is important because you don't want Apache to go on and try other rewrites.
P = Proxy the request (this requires mod_proxy). This makes sure that the browser does not display a different URL than https://www.example.com/products/30/myproduct (i.e. we do not want the end user seeing the rewritten URL as https://www.example.com/products?id=30&product_name=myproduct).
NE = No Escape. By default mod_rewrite converts special characters such as & and ? in the rewritten URL to their hex-code equivalents. You need this flag to keep them intact, as they are essential to URL parameters.
QSA = Query String Append. Any query string on the original request is combined with the one in the rewritten URL instead of being discarded, so additional URL parameters survive the rewrite.
Please Note: It is very important to consider how you want your URLs to look (the nice URLs) because that is what you want to submit to the search engines. If you change your URL structure, those nice URLs will no longer work and your search engine rankings may decrease.

TagUI RPA Download File From Browser

I want to automatically download a file from the browser using TagUI.
How can I do this with TagUI? Thanks in advance.
Sorry, I don't track Stack Overflow for user queries. For issues and questions, raise them directly on the GitHub page - https://github.com/kelaberetiv/TagUI/issues
To download files, there are 2 ways.
The first way is to write the TagUI script to perform the steps you would normally take to download the file: logging in and clicking whatever you click to download, as if you were doing it manually.
The second way, if you are familiar with Python, is to use the download() function to perform the download (provided it is a publicly available URL) - https://github.com/tebelorg/RPA-Python#pro-functions

Hiding the word "joomla" from a script in contact form

Whenever I create a contact form in my Joomla! 3.3.6, a script appears in the page's HTML code that contains the word Joomla many times. I'd like to replace those occurrences of Joomla with another word (e.g. Foo) for security reasons. I'd like to know whether I'm able to do so, and how.
That script is:
<script>(function(){var strings={"JLIB_FORM_FIELD_INVALID":"\u0641\u06cc\u0644\u062f \u0646\u0627\u0645\u0639\u062a\u0628\u0631:&#160"};if(typeof Joomla=='undefined'){Joomla={};Joomla.JText=strings;}
else{Joomla.JText.load(strings);}})();</script>
I have no idea whether a plugin or an extension creates it or not.
Thank you
Regards
This script seems to be translating some text required for the form to use in its javascript, eg validation messages. It does this using a javascript version of JText, which is part of core Joomla. There is some info on how that works here. Weirdly, there seems to be little information in the official Joomla documentation about it.
The main JText function it is calling appears here: media/system/js/core.js
I'm sure it would be possible to write a plug-in to remove this script before the page is rendered and then to translate any untranslated text with your own scripts. However, I'm not sure I see any security benefit in doing this so it seems a waste of time.
Ultimately, someone sniffing a site for what it is built in is far more likely to see if core files exist by going direct to places like media/system/js/core.js, rather than to scan the code for the word "Joomla" - which would trigger a lot of false-positives (any site which just mentions Joomla) and negatives (any page which doesn't have a form on it). It also does not reveal the version of Joomla, which is the info a hacker would more likely be after.
I think you have to search the whole directory for the script (e.g. with Notepad++). It must be a plugin for the contact form that has some inline script in it.
Also, do you use any special third-party plugin? That might be the source of it.
PS: I had a similar experience. I don't remember exactly how I got rid of those words, but like you, I wanted to hide the fact that I'm using Joomla for security reasons.
It is actually Joomla itself that adds this, from the file: Joomlainstall/libraries/joomla/document/html/renderer/head.php
It is loaded globally from:
Joomlainstall/libraries/cms/html/formbehavior.php
The developer adds that code by using the JText function, for example:
JText::_( 'COM_CONTACT_EMAIL_FORM' )
In my case it was the ContactUs Form plugin that added the JavaScript. If JText is not used, the script is not loaded. When I disabled the plugin, the JavaScript was no longer loaded. If you have that plugin enabled, maybe try another contact form?
From a security standpoint, this is bad programming by the Joomla developers, for sure.

How to transfer a WordPress SQL database from local to live

For backing up all my WordPress sites I use a tool called "BACKUP BUDDY". It's a great tool, but lately it's been really buggy, and today it finally went kaboom!
Usually my workflow is that I develop the site on my local machine using WAMP/MAMP.
When it's done and ready for testing, I use the tool to move it to my personal test server, and once I'm happy and the work is approved, I move it to the real server.
Since my tool stopped working (it uploads only half the content), I decided to do it manually: installing WordPress on the real web server (done), applying my theme (done),
then exporting the database SQL from the local server (done), and then importing it to the real server (done). Both times I've done this, the site came up blank (outcome equals major fail!).
I'm assuming that something has to be changed in order for it to work, but I'm not sure what.
Unlike a normal DB where I can work with the data directly, since WP is a CMS I'm assuming that it ties the data to the domain, but again, I don't know 100% how it works...
Any ideas as to what I'm doing wrong? As of now, if I can't do it like this, I'd have to manually create ALL the pages. Plus, if I were then to move it from my test server to the final destination, I'd have to manually redo it all again...
Thanks in advance.
You aren't doing anything wrong. It sounds like your particular workflow could be as follows.
Upload the contents of the site via FTP
Create and import the database via phpMyAdmin, updating wp-config.php to match
Define the site URL in wp-config.php [see below]
Use a tool to find and replace any hard-coded site URLs that WordPress loves to use. [see below]
Example code:
Define site urls
define('WP_HOME','http://example.com');
define('WP_SITEURL','http://example.com');
Find replace tool
Replace
http://localhost/
with
http://www.your-new-site.com/
That should be it. It's live!
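One caveat on the find and replace step: WordPress stores some settings as PHP-serialized strings of the form s:<byte length>:"<value>";, so a plain text replace that changes the URL length leaves a wrong length prefix and PHP can no longer unserialize the value. The sketch below only illustrates that idea. The function name is made up, the regular expression is deliberately naive (it breaks on values that themselves contain ";), and it is no substitute for a dedicated search-and-replace tool.

```python
import re

# Matches one PHP-serialized string: s:<byte length>:"<value>";
SERIALIZED_STRING = re.compile(r's:(\d+):"(.*?)";', re.DOTALL)


def replace_url(value, old, new):
    """Replace old with new, then repair the length prefix of touched strings."""
    replaced = value.replace(old, new)

    def fix_length(match):
        body = match.group(2)
        if new not in body:
            # Leave serialized strings that were not affected untouched.
            return match.group(0)
        # PHP counts bytes, not characters, so measure the UTF-8 encoding.
        return f's:{len(body.encode("utf-8"))}:"{body}";'

    return SERIALIZED_STRING.sub(fix_length, replaced)


option = 'a:1:{s:4:"home";s:17:"http://localhost/";}'
print(replace_url(option, "http://localhost/", "http://www.your-new-site.com/"))
# a:1:{s:4:"home";s:29:"http://www.your-new-site.com/";}
```

Values that are not serialized, such as post content, simply get the URL replaced and are otherwise left alone.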
You can export it using phpMyAdmin and then use BigDump to import it. Download BigDump from here and make sure you read the first note about the exporting process, found here
http://www.ozerov.de/bigdump/usage/
Here is a bash script you can use to automate this entire process for you: https://github.com/jplew/SyncDB

Resources