Unfortunately, I ran a full data import with the "clean index" option checked. I was able to copy the whole index to a backup directory before the files were deleted (I killed Solr), but the segments.gen and segments_N files had already been updated, so whenever I copy the index back to its original directory, all the index files are deleted on Solr startup.
I think the files are deleted because the segments files no longer contain my index files' information; they point to the "after clean" index files instead.
I tried to reconstruct the segments files somehow, but had no luck, and I also did not find a way to do it by changing the Solr code.
Is there any way to do this?
By the sound of it, I would guess that the segments_N and segments.gen files are unlikely to be the only things lost, but you can try using CheckIndex.
You can run it from the command line (with the Lucene core jar on the classpath), something like:
java -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex path/to/index -fix
Or you can invoke methods of it in your own implementation, something like:
import java.io.File;
import org.apache.lucene.index.CheckIndex;
import org.apache.lucene.store.FSDirectory;

FSDirectory directory = FSDirectory.open(new File("path/to/index"));
CheckIndex check = new CheckIndex(directory);
CheckIndex.Status status = check.checkIndex(); // diagnose the index
check.fixIndex(status); // drop any segments that cannot be read

Note that fixIndex (like -fix) works by removing the segments it cannot read, and any documents in those segments are gone for good, so run it against a copy of your backup.
Related
I use the shake build system and watch to keep the files for a web site current. I cannot work out how to ensure that the index file for a directory is rebuilt when a file in the directory changes.
The index file for a directory lists all the files in it, with their titles, and includes a link to each. From this, an HTML file is eventually produced by the shake process.
It requires reconstruction when one of the indexed files in the directory changes.
For each index file, the set of files indexed is marked as needed, but this does not force the index file to be rebuilt when a file in the directory changes. I had expected that this would trigger reconstruction of the index file whenever any of the needed files changes; that understanding seems to be wrong.
What is the most effective way to force a re-shake of the index file when a file in the directory changes? Is it sufficient to touch the index file to trigger reconstruction? Or is it better to recompute the conversion of the index.md file to the next step (pandoc), with the following processing steps then triggered by the shake logic? Or something else?
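For what it's worth, Shake only rebuilds a target when the rule that produces it recorded the dependencies itself, so no touching should be necessary if the index rule needs the files it lists. A minimal sketch of such a rule (the //index.md pattern and the *.page extension are assumptions, not your actual layout):

import Development.Shake
import Development.Shake.FilePath

main :: IO ()
main = shakeArgs shakeOptions $ do
    -- Hypothetical pattern: the index.md in any directory is generated.
    "//index.md" %> \out -> do
        let dir = takeDirectory out
        -- getDirectoryFiles is tracked: adding or removing a *.page file
        -- causes this rule to rerun.
        files <- getDirectoryFiles dir ["*.page"]
        -- need is tracked too: editing any listed file reruns the rule.
        need [dir </> f | f <- files]
        -- Placeholder body; the real rule would emit titles and links.
        writeFileChanged out (unlines files)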
Using Shake to create an mp3 (this is just a learning example), I run lame and then id3v2 to tag it.
If lame succeeds but id3v2 fails, I'm left with the mp3 file in place, but of course it is "wrong". I was looking for an option to automatically delete target files if a producing command errors, but I can't find anything. I can do this manually by checking the exit code and using removeFiles, or by building in a temporary directory and moving the file as the last step; but this seems like a common enough requirement (make does it by default), so I wonder if there's a function or simple technique that I'm just not seeing.
The reason Make does this by default is that if Make has a partial incomplete file on disk, it considers the task to have run successfully and be up to date, which breaks everything. In contrast, Shake records that a task ran successfully in a separate file (.shake.database), so it knows that your mp3 file isn't complete, and will rebuild it next time.
While Shake doesn't need you to delete the file, you might still want to do so to avoid confusing users. You can do that with actionOnException, something like:
let generateMp3 = do cmd "lame" ... ; cmd "id3v2" ...
let deleteMp3 = removeFile "foo.mp3"
actionOnException generateMp3 deleteMp3
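Here removeFile is the ordinary IO action from System.Directory; actionOnException pairs an Action with a plain IO cleanup to run if the Action throws, so no lifting should be needed.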
I have been exploring File Allocation Table (FAT) recovery for the last couple of weeks. My goal is to locate a possibly deleted file by its signature (for example, a ZIP file by the bytes "50 4B 03 04") and recover the whole thing so I can search inside it.
I've found that there's a problem with FAT: the file system uses the allocation table entries both for storing cluster chains and for marking deleted files, which at first sight makes file recovery impossible.
But there is plenty of recovery software advertising that promises to recover files deleted from FAT file systems, so I assume there must be a workaround.
I've found that we can successfully recover files located contiguously on disk: the first cluster gives us an index, and the index address value gives us a strong possibility of finding the directory entry where the file size is stored. But is that the end of it? I'd like to recover fragmented files as well, but can't find a way.
Does anyone know a workaround and can help me out here a bit, please?
The FAT file system uses a directory entry for each file and folder, which records the starting cluster, filename, date, and size. To access a file, the system looks the file up in its directory and notes the starting cluster. It then goes to the entry in the FAT (file allocation table) that corresponds to the starting cluster: that entry contains the number of the next cluster, the next cluster's entry points to the one after that, and so on until an end-of-file marker, which means that cluster is the last one used by the file.
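To make the chain-following concrete, here is a minimal FAT16 sketch (the in-memory fat table and the read_cluster helper are assumptions, not real APIs):

#include <stdint.h>

#define FAT16_EOC 0xFFF8  /* entries >= 0xFFF8 mark the end of a chain */

/* Hypothetical helper, defined elsewhere: copies one cluster's data out. */
void read_cluster(uint16_t cluster);

/* Follow a FAT16 cluster chain: each table entry holds the number of the
   next cluster, until an end-of-chain marker is reached. */
void walk_chain(const uint16_t *fat, uint16_t start_cluster)
{
    uint16_t cluster = start_cluster;
    while (cluster >= 0x0002 && cluster < FAT16_EOC) {  /* 0 and 1 are reserved */
        read_cluster(cluster);   /* read this cluster's data */
        cluster = fat[cluster];  /* follow the link to the next cluster */
    }
}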
When you delete a file or folder, the system locates the directory it resides in, changes the first byte of the file or folder name in its entry to E5 hex (the "deleted" marker), and clears the file's FAT chain.
That is why, once a file is deleted on a FAT file system, you can recover it only if it was stored contiguously. All data recovery utilities use this method; there is no other, unless you can find traces of an older copy of the FAT with the correct cluster chains still in place.
In ClearCase, I want to copy (fork, split) a file while preserving its history, something like svn cp old.txt new.txt. How do I do it?
It isn't possible to fork a file in ClearCase.
If you refactor your code and split a file in two, one of the halves will appear as a new file and you will lose the information about who wrote it; the annotate command will report the author of its lines as whoever split the file.
UCM or not, you cannot easily duplicate the full history of a file.
The best way to isolate a history is still to create a branch, so you can make new versions of the file without affecting the same file in the original branch.
The idea that 'svn cp' should be available in ClearCase might come from the fact that, in SVN, branches are directories; a tool like cc2svn will actually replicate ClearCase branches using 'svn cp'.
But since branches are first-class citizens in ClearCase, it is best to reason in terms of branches rather than in terms of copies/forks.
From the main page of cc2svn:
There is a difference in creating the branches in ClearCase and SVN:
SVN copies all files from parent branch to the target like: svn cp branches/main branches/dev_branch
ClearCase creates the actual branch for file upon checkout operation only.
Pretty simply done:
Check out the parent folder.
Move the element you wish to duplicate to the appropriate location (not within the checked-out parent folder).
Undo the checkout of the parent folder.
All the files are returned to the original folder with their history, and the duplicated ones remain in the new location with their history too. Now each file can be checked out and changed individually. A command-line sketch follows below.
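On the command line, the sequence looks something like this (a sketch only: the paths are illustrative, the destination folder must itself be checked out for the move, and your flags may differ):

cleartool checkout -nc srcdir destdir
cleartool mv srcdir/file.txt destdir/file.txt
cleartool checkin -nc destdir
cleartool uncheckout -rm srcdir

After the uncheckout, file.txt reappears in srcdir with its history intact, and the copy in destdir shares that same history.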
In C#, given a folder path, is there a way to get the last-modified file without getting all the files?
I need to quickly find folders that have been updated after a certain time, and if the last-modified file in a folder is older than that time, I want to skip the folder entirely.
I noticed that a folder's last-modified time does not get updated when one of its files is updated, so that approach doesn't work.
No; this is why Windows comes with indexing to speed up searching. The NTFS file system wasn't designed with fast searching in mind.
In any case, you can monitor file changes, which is not difficult to do. If your program can run in the background and monitor changes, this would work. If you need past history, you can do an initial scan once and then build up your hierarchy from there. As long as your program is always running, it will have a current snapshot and will not have to do the slow scan.
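A minimal sketch of that monitoring approach with FileSystemWatcher (the path and filters are assumptions):

using System;
using System.IO;

class Watcher
{
    static void Main()
    {
        // Hypothetical folder to monitor; adjust as needed.
        var watcher = new FileSystemWatcher(@"C:\data");
        watcher.IncludeSubdirectories = true;
        watcher.NotifyFilter = NotifyFilters.LastWrite | NotifyFilters.FileName;

        // Record each change; a real program would update its snapshot here.
        watcher.Changed += (s, e) => Console.WriteLine(e.FullPath + " changed");
        watcher.Created += (s, e) => Console.WriteLine(e.FullPath + " created");

        watcher.EnableRaisingEvents = true;
        Console.ReadLine(); // keep the process alive while watching
    }
}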
You can also use Windows Search itself to find the files. If indexing is available, it's probably as fast as you'll get.
Try this:
DirectoryInfo di = new DirectoryInfo(strPath); // strPath holds the folder path
DateTime dt = di.LastWriteTime;
Then you can use:
Directory.EnumerateFiles(strPath, "*.*", SearchOption.TopDirectoryOnly);
Then loop over that collection and get a FileInfo for each file.
I don't see a way to get the modified date of a file without getting a FileInfo reference for that file; as far as I know, there is no way around FileInfo here.
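Putting those pieces together, a minimal sketch (the folder path is an assumption) that finds the most recently modified file; note that it still has to read every file's timestamp, which is the point above:

using System;
using System.IO;
using System.Linq;

class LastModified
{
    static void Main()
    {
        string strPath = @"C:\data"; // assumed folder path

        // EnumerateFiles streams results instead of materializing them all.
        FileInfo newest = new DirectoryInfo(strPath)
            .EnumerateFiles("*.*", SearchOption.TopDirectoryOnly)
            .OrderByDescending(f => f.LastWriteTime)
            .FirstOrDefault();

        if (newest != null)
            Console.WriteLine(newest.FullName + " modified " + newest.LastWriteTime);
    }
}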