When you do this:
cp file1 file2
(file2 already exists)
What actually happens behind the scenes?
1) Does the content of file1 actually get copied to file2?
2) Or is a new file created with the name file2 (replacing the old one) that has the same content as file1?
1) Since you're using "cp", I assume the OS is Linux.
2) On Linux, a "file" is referenced by "inodes". Here are two example files:
$ ls -li 1 2
245728 -rw-r--r-- 1 paulsm users 8 Aug 14 14:52 1
245729 -rw-r--r-- 1 paulsm users 8 Aug 14 14:52 2
$ cat 1
Hello 1
$ cat 2
Hello 2
3) Here is the result after "cp"
$ cp 1 2
$ ls -li 1 2
245728 -rw-r--r-- 1 paulsm users 8 Aug 14 14:52 1
245729 -rw-r--r-- 1 paulsm users 8 Aug 14 14:55 2
$ cat 2
Hello 1
You see:
a) the contents of "1" completely replace "2"
b) there is no "new file" - the inode for "2" remains unchanged from before the copy
c) the file date is changed along with the file contents
'Hope that helps .. PSM
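To make the overwrite-in-place behaviour concrete, here is a minimal C sketch of one way a cp-style copy can behave when the destination already exists: the destination is opened with O_TRUNC rather than being removed and recreated, so it keeps its inode and only its data and modification time change, exactly as the ls -li output above shows. This is a simplified illustration, not the actual cp source.

/* copy_over.c - sketch: copy src over an existing dst in place. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv) {
    if (argc != 3) { fprintf(stderr, "usage: %s src dst\n", argv[0]); return 1; }

    int in  = open(argv[1], O_RDONLY);
    /* No unlink(): truncating the existing file preserves its inode. */
    int out = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (in < 0 || out < 0) { perror("open"); return 1; }

    char buf[4096];
    ssize_t n;
    while ((n = read(in, buf, sizeof buf)) > 0)
        if (write(out, buf, n) != n) { perror("write"); return 1; }

    close(in);
    close(out);
    return 0;
}

Running ls -li on the destination before and after should show the same inode number, matching the listing above.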
Usually the first: both the directory entry and the file's data are written.
Still, it would help to know what (file)system you are on (I'm guessing a Linux flavour).
You would probably be aware if you were creating a junction point or a symbolic/hard link.
Think of it like this:
A hard link is a pointer/name that points to the data; i.e. it is just an alternative filename, and it has the same inode number as the file it was created from.
A copy is, obviously, a copy of the data; it points to different data than the file it was copied from and has a different inode number.
There is also a difference in the system calls involved, but that's diving somewhat deep into the issue.
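To see that distinction in code, here is a minimal C sketch, assuming a scratch file data.txt exists in the current directory and data.copy was made from it with cp beforehand (both names are just placeholders for the example): the hard link created with link() reports the same inode number as the original, while the copy reports a different one.

/* inode_demo.c - sketch: compare inode numbers of a file, a hard link
 * to it, and an independent copy of it. */
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

static void show(const char *path) {
    struct stat st;
    if (stat(path, &st) == 0)
        printf("%-10s inode %lu\n", path, (unsigned long)st.st_ino);
    else
        perror(path);
}

int main(void) {
    link("data.txt", "data.link");   /* hard link: shares the inode (error ignored for brevity) */
    show("data.txt");
    show("data.link");               /* same inode number as data.txt */
    show("data.copy");               /* different inode number */
    return 0;
}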
For pedagogical purposes, I want to set up a basic command injection in C. I have the following code:
#include <stdio.h>
#include <stdlib.h>   /* malloc, system */
#include <string.h>   /* strlen, strncpy, strncat */
#include <unistd.h>

int main(int argc, char **argv) {
    char cat[] = "cat ";
    char *command;
    size_t commandLength;

    commandLength = strlen(cat) + strlen(argv[1]) + 1;
    command = (char *) malloc(commandLength);
    strncpy(command, cat, commandLength);
    strncat(command, argv[1], (commandLength - strlen(cat)));
    system(command);
    return (0);
}
I compile it, set the binary as owned by root, and set the SUID bit, as follows:
gcc injectionos.c -o injectionos
sudo chown root:root injectionos
sudo chmod +s injectionos
I obtain the following result:
ls -la
total 40
drwxr-xr-x 2 olive olive 4096 Jan 6 13:17 .
drwxr-xr-x 3 olive olive 4096 Jan 6 12:15 ..
-rwsr-sr-x 1 root root 16824 Jan 6 13:17 injectionos
-rw-r--r-- 1 olive olive 415 Jan 6 13:17 injectionos.c
-rwx------ 1 root root 9 Jan 6 12:43 titi.txt
-rw-r--r-- 1 olive olive 9 Jan 6 12:16 toto.txt
So, basically, with the SUID bit set, I should be able to open both the toto.txt and titi.txt files by performing the following injection:
./injectionos "toto.txt;cat titi.txt"
But when executing this command, I get a permission denied error when accessing titi.txt. However, when I add a setuid(geteuid()); call to my code, the injection works and I can access the titi.txt file.
Given that injectionos is run as root and titi.txt belongs to root, I assumed that would be enough, but apparently not. What am I missing here?
The privileges are being dropped by /bin/sh, executed as part of the system() call. See the man page for bash and its -p option:
If the shell is started with the effective user (group) id not equal
to the real user (group) id, and the -p option is not supplied, no
startup files are read, shell functions are not inherited from the
environment, the SHELLOPTS, BASHOPTS, CDPATH, and GLOBIGNORE
variables, if they appear in the environment, are ignored, and the
effective user id is set to the real user id. If the -p option is
supplied at invocation, the startup behavior is the same, but the
effective user id is not reset.
Well, technically Debian uses dash by default, but it does the same thing.
So the default behavior of the shell has been adjusted to mitigate this injection at least somewhat.
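For reference, this is roughly the change the asker describes: calling setuid(geteuid()) before system() makes the real uid equal the effective uid, so the euid != ruid check quoted above no longer triggers and the shell keeps the elevated privileges. A minimal sketch for a lab exercise only, since it deliberately defeats the shell's mitigation.

/* Sketch of the workaround described above (pedagogical use only). */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv) {
    if (argc < 2) return 1;

    /* In a SUID-root binary, geteuid() is 0 while getuid() is the caller. */
    if (setuid(geteuid()) != 0) {
        perror("setuid");
        return 1;
    }

    /* Still deliberately injectable: argv[1] reaches the shell unmodified. */
    char command[256];
    snprintf(command, sizeof command, "cat %s", argv[1]);
    return system(command);
}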
When I do ls -l I get
-rw-r--r-- 1 jboss admin **26644936** Sep 1 21:23 MyBig.war
How do I print it as below?
-rw-r--r-- 1 jboss admin **26,644,936** Sep 1 21:23 MyBig.war
The proper way to format ls output is to specify BLOCK_SIZE.
Saying:
BLOCK_SIZE="'1" ls -l
would achieve your desired result.
Quoting from the GNU coreutils documentation on block size:
Some GNU programs (at least df, du, and ls) display sizes in “blocks”.
You can adjust the block size and method of display to make sizes
easier to read.
A block size specification preceded by ‘'’ causes output sizes to be
displayed with thousands separators.
Using sed:
$ ls_output='-rw-r--r-- 1 jboss admin 26644936 Sep 1 21:23 MyBig.war'
$ echo "$ls_output" | sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta'
-rw-r--r-- 1 jboss admin 26,644,936 Sep 1 21:23 MyBig.war
The above sed command repeatedly rewrites a digit followed by three trailing digits (####) as #,###, inserting one comma per pass until no substitution succeeds.
-e :a: creates a label named a for the t command.
ta: jumps back to a if the previous substitution succeeded.
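If you end up formatting the sizes in a program rather than with sed, glibc's printf offers the same idea as the BLOCK_SIZE apostrophe: the ' (apostrophe) conversion flag groups digits according to the current locale. A minimal sketch; the flag is a POSIX extension, and separators only appear if the locale actually defines one (e.g. en_US.UTF-8, not the C locale).

/* group_digits.c - sketch: print a byte count with thousands separators. */
#include <locale.h>
#include <stdio.h>

int main(void) {
    setlocale(LC_NUMERIC, "");   /* pick up the environment's locale */
    long long size = 26644936;   /* the size from the question */
    printf("%'lld\n", size);     /* prints 26,644,936 in an en_US locale */
    return 0;
}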
After finding a file on the disk, I now need to print out all of its details, for example:
-rwxr-xr-x 1 1000 1000 8296 2010-01-06 22:29 ./Documents/exer4
-rwxr-xr-x 1 1000 1000 8517 2009-12-30 11:30 ./Documents/os/exer4
lrwxrwxrwx 1 1000 1000 8 2010-01-10 13:10 ./Documents/cs/2012/exer4 -> ../a.out
I need to print a file's details without using ls -ln. Any idea how to do that?
Thanks
You want the stat() function.
Here's a web page that documents *NIX file functions including stat():
http://rabbit.eng.miami.edu/info/functions/unixio.html
You can use this function:
int lstat(const char *path, struct stat *buf);
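Here is a minimal sketch of how lstat() can be used to print roughly what ls -ln shows (mode, link count, numeric owner and group, size, mtime, name). The formatting is simplified: the mode is printed in octal rather than as an rwx string, and symlinks are only flagged, not dereferenced.

/* filestat.c - sketch: print ls -ln style details for each argument. */
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

int main(int argc, char **argv) {
    for (int i = 1; i < argc; i++) {
        struct stat st;
        if (lstat(argv[i], &st) != 0) { perror(argv[i]); continue; }

        char when[32];
        strftime(when, sizeof when, "%Y-%m-%d %H:%M", localtime(&st.st_mtime));

        printf("%o %lu %u %u %lld %s %s%s\n",
               (unsigned)st.st_mode, (unsigned long)st.st_nlink,
               (unsigned)st.st_uid, (unsigned)st.st_gid,
               (long long)st.st_size, when, argv[i],
               S_ISLNK(st.st_mode) ? " (symlink)" : "");
    }
    return 0;
}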
This works for me from the shell.
ls -l | grep -v "\->"
It simply filters out any line that has a -> in it.
Note, however, that if you have any files/directories that for some reason have -> in their names, they will also be filtered out. Having said that, I've never seen that, nor would it be a good idea.
I have a directory I’m archiving:
$ du -sh oldcode
1400848
$ tar cf oldcode.tar oldcode
So the directory is 1.4 GB. The file is significantly smaller, though:
$ ls -l oldcode.tar
-rw-r--r-- 1 ieure ieure 940339200 2002-01-30 10:33 oldcode.tar
Only 897 MB. It’s not compressed in any way:
$ file oldcode.tar
oldcode.tar: POSIX tar archive
Why is the tar file smaller than its contents?
You get a difference because of the way the filesystem works.
In a nutshell, your disk is made up of clusters. Each cluster has a fixed size of, let's say, 4 kilobytes. If you store a 1 KB file in such a cluster, 3 KB will be unused. The exact details vary with the kind of filesystem that you use, but most filesystems work that way.
3 KB of wasted space is not much for a single file, but if you have lots of very small files the waste can become a significant part of the disk usage.
Inside the tar archive the files are not stored in clusters but one after another. That's where the difference comes from.
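You can observe this from a program too: stat() reports both the logical size (st_size) and the space actually allocated (st_blocks, in 512-byte units). A minimal sketch; the exact allocation depends on the filesystem and its block size.

/* blockwaste.c - sketch: compare a file's logical size with the space
 * the filesystem actually allocates for it. */
#include <stdio.h>
#include <sys/stat.h>

int main(int argc, char **argv) {
    for (int i = 1; i < argc; i++) {
        struct stat st;
        if (stat(argv[i], &st) != 0) { perror(argv[i]); continue; }
        printf("%s: logical %lld bytes, allocated %lld bytes\n",
               argv[i], (long long)st.st_size,
               (long long)st.st_blocks * 512);
    }
    return 0;
}

For a 1 KB file on a 4 KB-block filesystem, the allocated figure will typically read 4096 while the logical size reads 1024.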
Having no knowledge of what tar you're using or what sort of Unix system you're using, here's my guess: oldcode contains numerous smaller files which, on their own, use disk space inefficiently, since disk space is allocated in blocks of some sort rather than byte by byte. In the tar file they're concatenated, making maximum use of the disk space they're assigned.
This has something to do with the block size of your filesystem. man 1 du on Mac OS X 10.5.6 states:
The du utility displays the file system block usage for each file argument and for each directory in the file hierarchy rooted in each directory argument. If no file is specified, the block usage of the hierarchy rooted in the current directory is displayed.
[mirko@borg foo]$ ls -la
total 0
drwxr-xr-x 2 mirko wheel 68 Jan 30 21:20 .
drwxrwxrwt 10 root wheel 340 Jan 30 21:16 ..
[mirko@borg foo]$ du -sh
0B .
[mirko@borg foo]$ touch foo
[mirko@borg foo]$ ls -la
total 0
drwxr-xr-x 3 mirko wheel 102 Jan 30 21:20 .
drwxrwxrwt 10 root wheel 340 Jan 30 21:16 ..
-rw-r--r-- 1 mirko wheel 0 Jan 30 21:20 foo
[mirko@borg foo]$ du -sh
0B .
[mirko@borg foo]$ echo 1 > foo
[mirko@borg foo]$ ls -la
total 8
drwxr-xr-x 3 mirko wheel 102 Jan 30 21:20 .
drwxrwxrwt 10 root wheel 340 Jan 30 21:16 ..
-rw-r--r-- 1 mirko wheel 2 Jan 30 21:20 foo
[mirko@borg foo]$ du -sh
4.0K .
As you see, even a file of 2 bytes takes up a whole 4 KB block. Some filesystems avoid this waste of space by block suballocation.
There are 2 possibilities.
Small files
Most likely, it isn't smaller than its contents. As Nils Pipenbrinck wrote, du displays the amount of space the filesystem allocates, which, since files are stored in whole filesystem blocks, is usually more than the logical size of the file.
To view the logical size of the file, use du --apparent-size. In this case, the result should be smaller than the tar file.
Sparse files
Tar files can store sparse files. If the tarball was created using --sparse, the holes in the sparse files will be recorded, so the tarball could be smaller than the logical size of the files.
If the sparseness information in your extracted copy was somehow lost (e.g. if you extracted the tarball onto a filesystem that doesn't support sparse files, or if it was zipped and then unzipped, etc.), then du will report the expanded size.
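To see the sparse case for yourself, the sketch below creates a file with a large hole (assuming the filesystem supports sparse files): its logical size is just over 1 MB, but only a block or so is actually allocated, which is the information tar --sparse preserves.

/* sparse_demo.c - sketch: create a sparse file and compare its logical
 * size with its allocated size. On filesystems without sparse-file
 * support the two numbers will be roughly equal. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    int fd = open("sparse.bin", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    lseek(fd, 1024 * 1024, SEEK_SET);   /* leave a 1 MB hole */
    write(fd, "x", 1);                  /* one real byte at the end */
    close(fd);

    struct stat st;
    if (stat("sparse.bin", &st) == 0)
        printf("logical %lld bytes, allocated %lld bytes\n",
               (long long)st.st_size, (long long)st.st_blocks * 512);
    return 0;
}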
du counts disk blocks, not file size duder.
OK, I have been working with Solaris for 10+ years, and have never seen this...
I have a directory listing which includes both a file and subdirectory with the same name:
-rw-r--r-- 1 root other 15922214 Nov 29 2006 msheehan
drwxrwxrwx 12 msheehan sysadmin 2048 Mar 25 15:39 msheehan
I use file to discover contents of the file, and I get:
bash-2.03# file msheehan
msheehan: directory
bash-2.03# file msh*
msheehan: ascii text
msheehan: directory
I am not worried about the file, but I want to keep the directory, so I try rm:
bash-2.03# rm msheehan
rm: msheehan is a directory
So here is my two part question:
What's up with this?
How do I carefully delete the file?
Jonathan
Edit:
Thanks for the answers guys, both (so far) were helpful, but piping the listing to an editor did the trick, ala:
bash-2.03# ls -l > jb.txt
bash-2.03# vi jb.txt
Which contained:
-rw-r--r-- 1 root other 15922214 Nov 29 2006 msheehab^?n
drwxrwxrwx 12 msheehan sysadmin 2048 Mar 25 15:39 msheehan
Always be careful with the backspace key!
I would guess that these are in fact two different filenames that "look" the same, as the command file was able to distinguish them when the shell passed the expanded versions of the name in. Try piping ls into od or another hex/octal dump utility to see if they really have the same name, or if there are non-printing characters involved.
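As an alternative to piping ls into od, you can dump the raw bytes of each directory entry from C. A minimal sketch that lists the current directory and prints non-printable bytes as \xNN escapes, which makes a stray backspace or DEL character in a name obvious.

/* dumpnames.c - sketch: list the current directory, escaping any
 * non-printable bytes so near-identical names can be told apart. */
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>

int main(void) {
    DIR *d = opendir(".");
    if (!d) { perror("opendir"); return 1; }

    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        for (const unsigned char *p = (const unsigned char *)e->d_name; *p; p++) {
            if (isprint(*p))
                putchar(*p);
            else
                printf("\\x%02x", *p);
        }
        putchar('\n');
    }
    closedir(d);
    return 0;
}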
I'm wondering what could cause this. Aside from filesystem bugs, it could be caused by a non-ASCII character that got through somehow. In that case, use another language with easier string semantics to do the operation.
It would be interesting to see what would be the output of this ruby snippet:
ruby -e 'puts Dir["msheehan*"].inspect'
You can delete the file using its inode.
If you use the "-i" option of "ls":
$ ls -li
total 1
20801 -rw-r--r-- 1 root root 0 2010-11-08 01:55 a?
20802 -rw-r--r-- 1 root root 0 2010-11-08 01:55 a\?
$ find . -inum 20802 -exec rm {} \;
$ ls -li
total 1
20801 -rw-r--r-- 1 root root 0 2010-11-08 01:55 a?
I have an example (in Spanish) of how you can delete a file using its inode on Solaris:
http://sparcki.blogspot.com/2010/03/como-eliminar-archivos-utilizando-su.html
Urko,
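The same idea in C, assuming you have already looked up the inode number with ls -i (the number is passed on the command line): scan the current directory and unlink only the entry whose inode matches and which is a regular file, so the directory that shares the visible name is never touched.

/* rm_by_inode.c - sketch: remove a regular file identified by its inode
 * number. Usage: ./rm_by_inode <inode>. Scans the current directory only. */
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv) {
    if (argc != 2) { fprintf(stderr, "usage: %s <inode>\n", argv[0]); return 1; }
    unsigned long target = strtoul(argv[1], NULL, 10);

    DIR *d = opendir(".");
    if (!d) { perror("opendir"); return 1; }

    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        struct stat st;
        if (lstat(e->d_name, &st) != 0) continue;
        if ((unsigned long)st.st_ino == target && S_ISREG(st.st_mode)) {
            if (unlink(e->d_name) == 0)
                printf("removed %s (inode %lu)\n", e->d_name, target);
            else
                perror("unlink");
        }
    }
    closedir(d);
    return 0;
}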
And a quick answer to part 2 of my own question...
I would imagine I could rename the directory, delete the file, and rename the directory back to its original name again.
... I would still be interested to see what other people come up with.
JB
I suspect that one of them has a strange character in the name. You could try using the shell wildcard expansion to see that: type
cat msh*
and press the wildcard expansion key (in my shell it's Ctrl-X *). You should get two names listed, perhaps one of which has an escape character in it.
To see if there are special characters in your file name, try the -b or -q options to ls, assuming Solaris 8 has those options.
As another solution to deleting the file you can bring up the graphical file browser
(gasp!) and drag and drop the unwanted file to the trash.
Another solution might be to move the one file to a different name (the one without the unknown special character), then delete the special character directory name with wildcards.
mv msheehan temp
rm mshee*
mv temp msheehan
Of course, you want to be sure that only the file you want to delete matches the wildcard.
And, for your particular case, since one was a directory and the other a file, this command might have solved it all:
rmdir msheeha*
One quick-and-easy way to see non-printing characters and whitespace is to pipe the output through cat -vet, e.g.:
# ls -l | cat -vet
Nice and easy to remember!
For part 2, since one name contains two extra characters, you can use:
mv msheehan abc
mv msheeha??n xyz
Once you've done that, you've got sane file names again, that you can fix up as you need.