Is there such a thing as a javascript deminifier (deobfuscator)? [closed] - obfuscation

This question is exactly the opposite of Which Javascript minifier (cruncher) does the same things that the one Google uses for its JS APIs?
I want to learn how Google does its loading so I can build my own with non-popular JS toolkits.

Try this: JS Beautifier

Try http://www.jsnice.org/
I just stumbled on it and it is great. It expands the code and does statistical variable renaming. For example, if you have this code:
var g = f.originalEvent.targetTouches[0];
Then it turns your code into:
var touches = event.originalEvent.targetTouches[0];
Pretty good guess, methinks.
It turned this:
d.slide.hasClass("selected") ? (e.onSlideOpen.call(d.prev.children("div")[0]), q ? (e.rtl && d.slide.position().left > i.w / 2 || d.slide.position().left < i.w / 2) && h.animateSlide.call(d) : t && d.index ? (h.animateGroup(d, !0), h.fitToContent(d.prev)) : d.slide.position().top < i.h / 2 && h.animateSlide.call(d)) : (setTimeout(function() { e.onSlideOpen.call(d.slide.children("div")[0]) }, e.slideSpeed), h.animateGroup(d), t && h.fitToContent(d.slide)), e.autoPlay && (f.stop(), f.play(l.index(j.filter(".selected"))))
Into this:
if (e.slide.hasClass("selected")) {
  settings.onSlideOpen.call(e.prev.children("div")[0]);
  if (val) {
    if (settings.rtl && e.slide.position().left > box.w / 2 || e.slide.position().left < box.w / 2) {
      self.animateSlide.call(e);
    }
  } else {
    if (isMac && e.index) {
      self.animateGroup(e, true);
      self.fitToContent(e.prev);
    } else {
      if (e.slide.position().top < box.h / 2) {
        self.animateSlide.call(e);
      }
    }
  }
} else {
  setTimeout(function() {
    settings.onSlideOpen.call(e.slide.children("div")[0]);
  }, settings.slideSpeed);
  self.animateGroup(e);
  if (isMac) {
    self.fitToContent(e.slide);
  }
}
if (settings.autoPlay) {
  node.stop();
  node.play(tabs.index(options.filter(".selected")));
}
A library I'm working on has a couple of bugs, and after spending hours trying to decipher the code, finding this is going to save me a bunch of time.
Seriously, this tool wipes the floor with JS Beautifier.

Uhhh, it would be impossible to restore variable names unless there was a mapping of minified -> original variable names available. Otherwise, I think the authors of that tool could win the Randi prize for psychic feats.

Chrome Developer tools has this built in

You will not be able to reconstruct method names or variable names. The best you can hope for is a simple JS code formatter (like those previously mentioned), and then to go through the file method by method, line by line, working out what each part does.
Perhaps using a good JS refactoring tool would make this easier as well (being able to rename/document methods)

You can use the \b (word boundary) feature in regular expressions to find single-letter variable names in a file.
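for i in {a..z}; do
  # pick one random all-lowercase word so the replacement is a plain identifier
  word=$(grep -E '^[a-z]+$' /usr/share/dict/words | shuf -n 1)
  sed -i "s/\b$i\b/$word/g" somefile.js
done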
You can also use this in vim with something like :%s/\<a\>/truesaiyanpower/g.
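The same word-boundary idea can be sketched in Python. This is only an illustration: rename_single_letters and the replacement names are made up here, and like the sed loop it is purely textual, so it will also rewrite matches inside strings and comments.

```python
import re


def rename_single_letters(source, names):
    """Replace whole-word occurrences of each key in `names` with its value."""
    # \b anchors the match at word boundaries, so "a" matches the identifier
    # `a` but not the "a" inside `bar`.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, names)) + r")\b")
    # A single pass over the text means a replacement is never renamed again.
    return pattern.sub(lambda m: names[m.group(1)], source)


minified = "var a = b.bar + a;"
print(rename_single_letters(minified, {"a": "total", "b": "config"}))
```

Doing all the renames in one substitution avoids the problem of one pass of the loop rewriting a word that an earlier pass inserted.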

To unminify js files, Unminify would be the best!

To unminify css, html and js files, you can use Unminify or Unminify JS online tool!

See our SD ECMAScript Formatter for a tool that will nicely format code.
EDIT: If you want to reverse the renaming process, you need something that can rename the obfuscated names back to the originals.
This tool can technically do that: SD Thicket ECMAScript Obfuscator.
It does so by applying a renaming map over which you have precise control.
Typically you implicitly construct such a map during the obfuscation process by choosing which names to obfuscate and which to preserve, and the obfuscator applies that map to produce the obfuscated code.
The Thicket obfuscator generates this map as a side effect when you obfuscate, essentially in the form of a set of pairs (originalname,obfuscatedname), for reference and debugging purposes.
Swapping elements gives the map (obfuscatedname,originalname). That inverted map can be applied by Thicket to recover the code with the original names, from the obfuscated code. And the Thicket obfuscator includes the Formatter to let you make it look nice again.
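To make the map inversion concrete, here is a rough Python sketch. It is not Thicket's interface: invert_map, apply_map and the sample names are hypothetical, and the renaming is a plain word-boundary text substitution rather than a real parse of the code.

```python
import re


def invert_map(rename_map):
    """Turn {original: obfuscated} into {obfuscated: original}."""
    inverted = {obf: orig for orig, obf in rename_map.items()}
    if len(inverted) != len(rename_map):
        # The inverse is only well defined if no obfuscated name is reused.
        raise ValueError("two original names share one obfuscated name")
    return inverted


def apply_map(source, mapping):
    """Rename every whole-word occurrence of a key to its mapped value."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, mapping)) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], source)


rename_map = {"slideSpeed": "q1", "animateGroup": "q2"}  # made-up pairs
obfuscated = apply_map("animateGroup(slideSpeed);", rename_map)
restored = apply_map(obfuscated, invert_map(rename_map))
print(obfuscated)
print(restored)
```

Without rename_map there is nothing to invert, which is the point made in the next paragraph.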
The catch to "reversing minification" (as you put it poorly, you are trying to reverse obfuscation), is that you need the map. Since people doing obfuscation don't give away the map, you, as a recipient of obfuscated code, have nothing to apply. A would-be pirate would have to reconstruct the map presumably by painful reverse engineering.
The "reversing" process also can't recover comments. They're gone forever.
This is all by design, so the real question is why are you looking to reverse obfuscation?

A JavaScript minifier and a JavaScript obfuscator are two different things.
Minifier - removes the comments, unnecessary whitespace and newlines from a program.
Obfuscator - makes modifications to the program, changing the names of variables, functions, and members, making the program much harder to understand. Some obfuscators are quite aggressive in their modifications to code.
This Javascript Obfuscator will obfuscate your code.
If you want to deobfuscate your code, try Javascript Beautifier. It will deobfuscate if obfuscation is found and then beautify the code.

Related

Is it possible to override bracket operators in ruby like endpoint notations in math?

Trying to implement something like this:
arr = (1..10)
arr[2,5] = [2,3,4,5]
arr(2,5] = [3,4,5]
arr[2,5) = [2,3,4]
arr(2,5) = [3,4]
Well, we need to override four bracket operators: [], [), (], ()
Any ideas?
It's called "Including or excluding" in mathematics. https://en.wikipedia.org/wiki/Interval_(mathematics)#Including_or_excluding_endpoints
In short, this is not possible with the current Ruby parser.
The slightly longer answer: You'd have to start by modifying parse.y to support the syntax you propose and recompile Ruby. This is of course not a terribly practical approach, since you'd have to do that again for every new Ruby version. The saner approach would be to start a discussion on ruby-core to see if there is sufficient interest for this to be made part of the language (probably not tbh).
Your wanted syntax is not valid for the Ruby parser, but it could be implemented in Ruby with the help of self-modifying code.
The source files need to be pre-processed. A simple regular expression can substitute your interval expressions with ordinary method syntax, i.e.
arr[2,5] -> interval_closed(arr,2,5)
arr(2,5] -> interval_left_open(arr,2,5)
arr[2,5) -> interval_right_open(arr,2,5)
arr(2,5) -> interval_open(arr,2,5)
The string holding the modified source can be evaluated and becomes part of the application just like a source file on the hard disk. (See instance_eval)
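As a rough sketch of the substitution step, written in Python only to show the regular expression at work: preprocess and the pattern are assumptions, the method names are the ones from the mapping above, and only integer-literal endpoints are handled. Note that this naive pattern would also rewrite an ordinary call such as foo(2,5), which is one of the complications to weigh.

```python
import re

# Bracket pair -> method name, following the mapping given above.
NAMES = {
    ("[", "]"): "interval_closed",
    ("(", "]"): "interval_left_open",
    ("[", ")"): "interval_right_open",
    ("(", ")"): "interval_open",
}

# receiver, opening bracket, two integer literals, closing bracket
INTERVAL = re.compile(r"\b(\w+)([\[(])\s*(\d+)\s*,\s*(\d+)\s*([\])])")


def preprocess(source):
    """Rewrite interval expressions in `source` into ordinary method calls."""
    def rewrite(match):
        receiver, opening, low, high, closing = match.groups()
        return f"{NAMES[(opening, closing)]}({receiver},{low},{high})"
    return INTERVAL.sub(rewrite, source)


for expr in ["arr[2,5]", "arr(2,5]", "arr[2,5)", "arr(2,5)"]:
    print(expr, "->", preprocess(expr))
```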
The usage of self-modifying code should be well justified.
Is the added value worth the effort and the complications?
Does the code have to be readable for other programmers?
Is the preprocessing practical? E.g. will this syntax occur in one or a few isolated files, or be spread everywhere?

How to configure generate.py pretty to not break apart comments

How do I configure the generate.py pretty to stop breaking apart comments like this:
// this is a comment
// var a = 10;
after running the generator this becomes:
// this is a comment
// var a = 10;
I can't seem to track down how to stop this in the documentation. Thanks for your help in advance!
Currently, this behavior is not customizable through the config. You either use block comments for those cases, or hack the Python code (I can give you pointers if you want).
In any case you may want to open an enhancement bug for this.

Automatically Generate C Code From Header

I want to generate empty implementations of procedures defined in a header file. Ideally they should return NULL for pointers, 0 for integers, etc, and, in an ideal world, also print to stderr which function was called.
The motivation for this is the need to implement a wrapper that adapts a subset of a complex, existing API (the header file) to another library. Only a small number of the procedures in the API need to be delegated, but it's not clear which ones. So I hope to use an iterative approach, where I run against this auto-generated wrapper, see what is called, implement that with delegation, and repeat.
I've seen Automatically generate C++ file from header? but the answers appear to be C++ specific.
So, for people that need the question spelled out in simple terms, how can I automate the generation of such an implementation given the header file? I would prefer an existing tool - my current best guess at a simple solution is using pycparser.
update Thanks guys. Both good answers. Also posted my current hack.
So, I'm going to mark the EA suggestion as the "answer" because I think it's probably the best idea in general, although I think that the cmock suggestion would work very well in a TDD approach where the library development was driven by test failures, and I may end up trying that. But for now, I need a quicker and dirtier approach that works in an interactive way (the library in question is a dynamically loaded plugin for another, interactive, program, and I am trying to reverse engineer the sequence of API calls...)
So what I ended up doing was writing a Python script that calls pycparser. I'll include it here in case it helps others, but it is not at all general (it assumes all functions return int, for example, and has a hack to avoid function definitions inside typedefs).
from pycparser import parse_file
from pycparser.c_ast import NodeVisitor


class AncestorVisitor(NodeVisitor):
    """NodeVisitor that tracks how deep in the AST the current node is."""

    def __init__(self):
        self.current = None
        self.ancestors = []

    def visit(self, node):
        if self.current:
            self.ancestors.append(self.current)
        self.current = node
        try:
            return super().visit(node)
        finally:
            if self.ancestors:
                self.ancestors.pop(-1)


class FunctionVisitor(AncestorVisitor):
    def visit_FuncDecl(self, node):
        if len(self.ancestors) < 3:  # avoid typedefs
            # emit "<return type> <name> ( <type> <param> , ... )"
            print(node.type.type.names[0], node.type.declname, '(', end=' ')
            first = True
            for param in node.args.params:
                if first:
                    first = False
                else:
                    print(',', end=' ')
                print(param.type.type.names[0], param.type.declname, end=' ')
            print(')')
            # stub body: log the function name to stderr and return 0
            print('{fprintf(stderr, "%s\\n"); return 0;}' % node.type.declname)


print('#include "myheader.h"')
print('#include <stdio.h>')
ast = parse_file('myheader.h', use_cpp=True)
FunctionVisitor().visit(ast)
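The script above assumes every function returns int. For the part of the question about returning NULL for pointers and 0 for integers, a minimal sketch of that choice could look like the following. It is not part of the script above and does not use pycparser: stub_for and its regular expression are hypothetical and only cope with simple one-line prototypes.

```python
import re

# Very rough prototype pattern: return type, function name, parameter list.
PROTOTYPE = re.compile(
    r"^\s*(?P<ret>[\w\s\*]+?)\s*\b(?P<name>\w+)\s*\((?P<params>[^)]*)\)\s*;\s*$"
)


def stub_for(prototype):
    """Return an empty C implementation for a one-line prototype string."""
    match = PROTOTYPE.match(prototype)
    if match is None:
        raise ValueError("unsupported prototype: %r" % prototype)
    ret = match.group("ret").strip()
    name = match.group("name")
    params = match.group("params").strip()
    # Every stub reports its own name on stderr.
    body = f'fprintf(stderr, "{name}\\n");'
    if "*" in ret:
        body += " return NULL;"   # pointer return types
    elif ret != "void":
        body += " return 0;"      # everything else is treated as numeric
    return f"{ret} {name}({params}) {{ {body} }}"


for proto in ["char *get_name(int id);", "int count(void);", "void reset(void);"]:
    print(stub_for(proto))
```

The generated file would still need the #include lines that the script above prints, and struct or typedef return types would need their own default value.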
UML modeling tools are capable of generating default implementation in the language of choice. Generally there is also a support for importing source code (including C headers). You can try to import your headers and generate source code from them. I personally have experience with Enterprise Architect and it supports both of these operations.
Caveat: this is an unresearched answer as I haven't had any experience with it myself.
I think you might have some luck with a mocking framework designed for unit testing. An example of such a framework is: cmock
The project page suggests it will generate code from a header. You could then take the code and tweak it.

locating all substring instances in a given file

I'm currently working on a function to find all images referenced in an HTML file. I am trying to find these substrings within the file: ".bmp", ".gif", ".jpg" and ".png". I also want to find their roots, e.g. /images/foo/, and then use the two substrings to make a new string: /images/foo/bar.jpg. I know how I am going to concatenate the strings, but I have no idea how to locate the actual substrings. I feel quite overwhelmed right now and would really appreciate some help.
The "right" answer to this question ought to urge you to use tools that were built for the job. Smart people write stuff like libxml for a reason. Re-inventing the wheel will only make things more difficult. With libxml, for example, you easily traverse an XML tree like so:
for (cur_node = a_node; cur_node; cur_node = cur_node->next) {
    if (cur_node->type == XML_ELEMENT_NODE) {
        printf("node type: Element, name: %s\n", cur_node->name);
    }
}
The "wrong" answer is to come up with some "trick" for finding the beginning of an image string, either by looking for the beginning of the image tag (<img) or a quote " as Doug mentions in the comments.
You'll notice that I put right and wrong in quotations. I'm somewhat of a purist and would strongly suggest an XML-oriented solution because it's wholly generalizable and easily extendible (tomorrow you may say: oh I also need the anchor text). A DOM parser makes every subsequent problem a breeze to solve.
But if you're working on a proof of concept or prototype (or maybe even homework) where everything's well-formed and you don't release your code in the wild, the "wrong" approach may be sufficient.
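For comparison, here is what the parser-based route might look like with Python's standard html.parser module instead of libxml. It is a sketch only: ImageCollector and find_images are made-up names, and it looks solely at the src attribute of img tags.

```python
from html.parser import HTMLParser

IMAGE_EXTENSIONS = (".bmp", ".gif", ".jpg", ".png")


class ImageCollector(HTMLParser):
    """Collects the src of every <img> tag that ends in a known extension."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        # str.endswith accepts a tuple, so one call checks all extensions.
        if src and src.lower().endswith(IMAGE_EXTENSIONS):
            self.images.append(src)


def find_images(html):
    collector = ImageCollector()
    collector.feed(html)
    return collector.images


page = '<p><img src="/images/foo/bar.jpg"><a href="z.png">z</a></p>'
print(find_images(page))
```

Because the parser hands over the whole attribute value, the root (/images/foo/) and the file name arrive already joined, so no substring searching or concatenation is needed.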

Does bracket placement affect readability? [duplicate]

Possible Duplicates:
Formatting of if Statements
Is there a best coding style for identations (same line, next line)?
Best way to code stackoverflow style 'questions' / 'tags' rollover buttons
public void Method() {
}
or
public void Method()
{
}
Besides personal preference is there any benefit of one style over another? I used to swear by the second method though now use the first style for work and personal projects.
By readability I mean imagine code in those methods - if/else etc...
Google C++ Style Guide suggests
Return type on the same line as function name, parameters on the same line if they fit.
Functions look like this:
ReturnType ClassName::FunctionName(Type par_name1, Type par_name2) {
  DoSomething();
  ...
}
WebKit Coding Style Guidelines suggests
Function definitions: place each brace on its own line.
Right:
int main()
{
    ...
}
Wrong:
int main() {
    ...
}
They suggest braces-on-same-line for everything else, though.
GNU Coding Standards suggests
It is important to put the open-brace that starts the body of a C function in column one, so that they will start a defun. Several tools look for open-braces in column one to find the beginnings of C functions. These tools will not work on code not formatted that way.
Avoid putting open-brace, open-parenthesis or open-bracket in column one when they are inside a function, so that they won't start a defun. The open-brace that starts a struct body can go in column one if you find it useful to treat that definition as a defun.
It is also important for function definitions to start the name of the function in column one. This helps people to search for function definitions, and may also help certain tools recognize them. Thus, using Standard C syntax, the format is this:
static char *
concat (char *s1, char *s2)
{
  ...
}
or, if you want to use traditional C syntax, format the definition like this:
static char *
concat (s1, s2)        /* Name starts in column one here */
     char *s1, *s2;
{                     /* Open brace in column one here */
  ...
}
As you can see, everybody has their own opinions. Personally, I prefer the Perl-ish braces-on-same-line-except-for-else, but as long as everybody working on the code can cooperate, it really doesn't matter.
I think it is completely subjective; however, I think it is important to establish code standards for your team and have everyone use the same style. That being said, I like the second one (and have made my team use it) because it seems easier to read when it is not your code.
In the old days we used to use the first style (K&R style) because screens were smaller and code was often printed onto this stuff called paper.
These days we have big screens and the second method (ANSI style) makes it easier to see if your brackets match up.
See HERE and HERE for more information.
First one is smaller in terms of number of lines (maybe that is why development -Java- books tend to use that syntax)
Second one is, IMHO easier to read as you always have two aligned brackets.
Anyway both of them are widely used, it's a matter of your personal preferences.
I use the if statement as something to reason on in this highly emotive subject.
if (cond) {
    //code
}
by just asking what does the else statement look like? The logical extension of the above is:-
if (cond) {
    //code
} else {
    //more code
}
Is that readable? I don't think so, and it's just plain ugly too.
More lines != less readable. Hence I'd go with your latter option.
Personally I find the second one more readable (aligned curly braces).
It's always easiest for a team to go with the defaults, and since Visual Studio and I agree on this, that's my argument. ;-)
Your lines of code count will be considerably less with the first option. :)
