I have to implement an installer that includes a lot of files and a huge directory structure. Implementing all of the required components manually would take a lot of time.
Is there a shorter way to speed this up?
You can use the WiX Harvest Tool (Heat) to compose the .wxs automatically. Usage (from the community tutorial):
heat dir folder -cg SampleGroup -out SampleGroup.wxs
will harvest the specified folder recursively, creating a Fragment consisting of a ComponentGroup with the name given by the -cg switch. The group will contain as many components as there are files, each component holding a single file, as the rules dictate, and will assign uniquely generated component, directory and file identifiers that remain the same when regenerated on the same input set. GUIDs will not be generated (only a placeholder text) unless explicitly instructed to do so by the -gg switch.
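For example, a fuller invocation (the folder, group name, directory reference and output file below are placeholders, not taken from the tutorial) might look like this with a WiX 3.x toolset:
heat dir "C:\MyApp\Files" -cg MyComponentGroup -gg -sfrag -srd -dr INSTALLFOLDER -var var.SourceDir -out MyComponentGroup.wxs
Here -gg generates real GUIDs, -sfrag and -srd keep the output compact, -dr sets the directory the harvested tree is referenced under, and -var replaces the SourceDir prefix with a preprocessor variable.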
There are also tons of answers on SO; just search for the 'heat' keyword.
I manage a team of designers working in InDesign.
When we work on a project, it often happens that a designer has to work on another designer's project. We work with Dropbox for Business.
But when we take over another designer's work, there are often missing links and fonts.
Is there a plugin, or a way to develop a plugin, that would, when we create a new .indd file (or when saving an existing one):
Automatically create a "Links" folder and a "Document fonts" folder next to the .indd file
Systematically add any new link or font to the corresponding folder?
To put it simply: on every action involving a font or a link, build a kind of InDesign package in real time?
If this is not possible, do you have any other solutions to meet this need?
I don't know of a specific script or plugin that does this.
However, it should be possible to write a script with an event handler for the beforeClose event that runs certain commands every time a user closes a document (or even every time a user adds, changes or deletes a link). At that point the script could run copyLink commands on all of the images and fonts (?), placing them all in folders next to the document.
The whole script could be made a startup script, so it becomes active any time any user runs InDesign.
(I'm actually not sure whether fonts can be copied that easily. The worst case would be that the script has to run a packaging command to gather the fonts somewhere, copy them over to where you need them and then delete the rest of the temporary package.)
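A rough, untested sketch of that idea as an ExtendScript startup script (the folder name, the handler and the error handling are my own assumptions, not a tested plugin):
#targetengine "session"
// Copy every placed file into a "Links" folder next to the document whenever it is closed.
app.addEventListener("beforeClose", function (evt) {
    var doc = evt.parent; // the document being closed
    if (!(doc instanceof Document)) {
        return;
    }
    var linksFolder;
    try {
        linksFolder = new Folder(doc.filePath + "/Links"); // throws if the document was never saved
    } catch (e) {
        return;
    }
    if (!linksFolder.exists) {
        linksFolder.create();
    }
    for (var i = 0; i < doc.links.length; i++) {
        try {
            doc.links[i].copyLink(linksFolder); // copy the linked file next to the .indd
        } catch (err) {
            // ignore links that are missing, embedded or otherwise not copyable
        }
    }
    // Fonts would still need a packaging step, as noted above.
});
Placed in the startup scripts folder this would run for every user, but the event name, event.parent and the copyLink behaviour should be double-checked against the scripting DOM of your InDesign version.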
Did you consider Creative Cloud Libraries? They are meant to allow sharing assets within a team. Apart from that, your users would need the same access to the file system (a common drive letter for the network path, for example).
Another solution would be to use a DAM system, so users would link files from the DAM.
Finally, you could certainly consider a script such as mdomino suggested.
Before I begin, I would like to express my appreciation for all of the insight I've gained on Stack Overflow and for everyone who contributes. I have a general question about managing large numbers of files, and I'm trying to determine my options, if any. Here it goes.
Currently, I have a large number of files and I'm on Windows 7. What I've been doing is categorizing the files by copying them into folders based on what needs to be processed together. So, I have one set that contains the files by date (for long term storage) and another that contains the copies by category (for processing and calculations). Of course this doubles my data each time. Now I'm having to create more than one set of categories; 3 copies to be exact. This is quadrupling my data.
For the processing side of things, the data ends up in Excel. Originally, all the data was brought into Excel, and all organization and filtering was performed there. This was time consuming and not easily maintainable over the long term. Later the workload was shifted to the file system itself, which lightened the work in Excel.
The long and short of it is that this is an extremely inefficient use of disk space. What would be a better way of handling this?
Things that have come to mind:
Overlapping Folders
Is there a way to create a folder that only holds references to files, rather than copies of them? That way two folders could point to the same file.
To my understanding, a folder is essentially a file listing the locations of the files inside it, but on Windows a file can normally only be listed in one folder.
Microsoft SQL Server
Not sure what could be done here.
Symbolic Links
I'm not an administrator, so I cannot execute the mklink command.
Also, I'm uncertain about any performance issues with this.
A Junction
Apparently not allowed for individual files, only for folders in Windows.
Search folders (*.search-ms)
Maybe I'm missing something, but to my knowledge there is no way to specify individual files to be listed.
Hashing the files
Creating hashes of all the files would allow each file to be stored only once, but then I have no idea how I would manage the hashes.
XML
Maybe I could use XML files to attach metadata to the files and somehow search using them.
Database File System
I recently came across this concept in my search. Not sure how it would apply to Windows.
I have found a partial solution. First, I discovered that the laptop I'm using is actually logged in as Administrator. As an alternative to options 3 and 4, I have decided to use hard links, which are part of the NTFS file system. However, due to the large number of files, creating the links one at a time with the following command from an elevated command prompt is unmanageable:
mklink /H <link_to_create> <existing_file>
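For example (hypothetical paths), to make the same spreadsheet show up in a category folder while the only physical copy lives in the date folder:
mklink /H "C:\ByCategory\Reports\data.xlsx" "C:\ByDate\2014-05\data.xlsx"
The new link name comes first and the existing file second, and both must be on the same NTFS volume.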
Luckily, Hermann Schinagl has created the Link Shell Extension application for Windows Explorer, along with a very insightful write-up on how junctions, symbolic links and hard links work. The only reason this is currently a partial solution is a separate problem with Windows Explorer, which I intend to post as a separate question. Thank you Hermann.
While building web applications, we often have files associated with database entries, e.g. we have a user table and each record has an avatar field, which holds the path to the associated image.
To make sure there are no conflicts in filenames we can either:
rename files upon upload to ID.jpg; the path would then be /user-avatars/ID.jpg
or create a sub-directory for each entity and leave the original filename intact; the path would then be /user-avatars/ID/original_filename.jpg
where ID is the user's unique ID number
Both are perfectly valid from the application logic's point of view.
But which one would be better from a filesystem performance point of view? We have to keep in mind that the number of entries can be very high (millions).
Is there any limit to a number of sub-directories a directory can hold?
It's going to depend on your file system, but I'm going to assume you're talking about something simple like ext3, and that you're not running a distributed file system (some of which are quite good at this). In general, file systems perform poorly over a certain number of entries in a single directory, regardless of whether those entries are directories or files. So whether you're creating one directory per image or putting every image in the root directory, you will run into scaling problems. If you look at this answer:
How many files in a directory is too many (on Windows and Linux)?
You'll see that ext3 runs into limits at about 32K entries in a directory, far fewer than you're proposing.
Off the top of my head, I'd suggest doing some rudimentary sharding into a multilevel directory tree, something like /user-avatars/1/2/12345/original_filename.jpg. (Or something appropriate for your type of ID, but I am interpreting your question to be about numeric IDs.) Doing that will also make your life easier later when you decide you want to distribute across a storage cluster, since you can spread the directories around.
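A minimal sketch of that path scheme, assuming numeric IDs (the function name and padding rule are just illustrative choices):
import os

def avatar_path(user_id, filename, root="/user-avatars"):
    # Pad short IDs so there are always two shard levels, e.g. 7 -> 0/7/7/...
    padded = str(user_id).zfill(2)
    return os.path.join(root, padded[0], padded[1], str(user_id), filename)

print(avatar_path(12345, "original_filename.jpg"))
# /user-avatars/1/2/12345/original_filename.jpg
The two single-digit levels only spread the load by leading digits; for very large ID ranges you would shard on more digits, but the idea is the same as the /1/2/12345/ example above.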
Millions of entries (either files or directories) in one parent directory would be hard to deal with for any filesystem. While modern filesystems use sorting and various tree algorithms to quickly find the needed files, even navigating to the folder with Windows Explorer, Midnight Commander or any other file manager will be slow, as the file manager has to read the whole contents of the directory. The same applies to file search. So subdirectories are preferred for this.
That said, I should note that access to a particular file can be a bit faster when all files are in one directory than when they are split into subdirectories, at least on NTFS (I measured this myself several times with 400K files).
I've been having a very similar issue with HTML files rather than images, trying to store millions of them on an Ubuntu server on ext4. I ended up running my own benchmarks and found that a flat directory performs far better while being much simpler to use:
Reference: article
If you really want to use files, maybe your best bet is to partition the files off into several subdirectories so that you don't hit a limit. For example, if you have an ID 123456, you can put it in /12/34/56.jpg.
However, I would recommend just using the database to store this data since you are already using one. You can store the image data and ID in the same table, and you don't have to worry about some of the pesky business of dealing with files like making sure the permissions are set right, etc.
I'm thinking of a mask as in a circuit mask (I think). Let me explain the setup:
The common source would be physically in c:\source
Instance A would be physically in c:\instanceA but initially have nothing but symlinks to everything in c:\source
Instance B would be physically in c:\instanceB but initially have nothing but symlinks to everything in c:\source
As you made changes to Instance A and Instance B, you would create a mask that hides a file from the Common Source if it is deleted from an instance folder, and creates a new physical file in the instance directory if an existing Common Source file is modified.
New files would live in the instance folders but never make it back to the Common Source.
This type of setup would be very useful for a project where I want to do many different types of small tweaks to multiple instances where distinct threads would work on distinct instances.
I know about symbolic links but they fall short in the case of modifying a file.
Is there anything that can accomplish this? If not, should I try to make this and patent it? Seems like a good idea to me.
I would be on Windows Server 2008 or later.
At the risk of stating the obvious, git is one tool that can be used to achieve this behavior.
Make your "Common Source" a git repository
Clone the repository twice to "InstanceA" and "InstanceB"
In each instance, check out a new, unique branch
As changes are made in "Common Source" you can merge those changes into "InstanceA" and "InstanceB" while maintaining the "MASK" (changes to the branch) you've created for each.
This has the added benefit of allowing changes from "Common Source" to be pulled as you wish instead of having changes to "Common Source" pushed out to each instance (something I imagine would be less desirable and more prone to error).
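A rough command sketch of those steps (the paths and branch names are just examples, and it assumes the Common Source repository's main branch is called master):
git clone C:\CommonSource C:\InstanceA
cd C:\InstanceA
git checkout -b instance-a
REM ...commit Instance A specific tweaks on this branch...
git pull origin master
REM merges new Common Source commits into instance-a when you choose
REM (repeat the clone/branch steps for C:\InstanceB)
Deletions and modifications committed on the instance branch act as the "mask", while untouched files keep coming from the common history.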
You're looking for a union mount. Unfortunately, I'm not aware of any implementations for Windows, but there are several available for Linux, notably UnionFS.
In general they are used for making a read-only filesystem look like it's read-write: typically on live-CDs.
Since Windows 7 you can use Libraries, which allow you to include files from more than one physical location.
Windows 7 also includes the VirtualStore type of folder: for example, when creating or modifying a file in the Program Files folder, it will actually be created in a user-specific folder, C:\Users\user\AppData\Local\VirtualStore. However, I don't know how you can create this type of folder yourself, and also, as far as I know, you can add and modify files but not delete them that way.
You'll want a version control system that supports per-file checkout and permissions. Then you just need to set up a simple API converter that takes file-system commands and converts them to version control commands.
Delete -> disable permission to access file.
Directory commands should look for local copies and things you have permission to access.
Open -> grab local copy, on fail check-out file from repository.
Save -> disable permission, save local copy. //Avoid duplicates being seen.
Close without saving -> if permission to access from repository, delete local copy.
((By the way, this storage optimization seems somewhat spurious for versioning. Disk space is relatively cheap.
If your interest isn't in versioning, I'd suggest looking into separating out the information you would potentially want as volatile and creating configuration files for each branch. This, of course, requires a predictable pattern to the changes.))
IBM Rational ClearCase is a version control system with file-mask-like behaviour. It is known as MVFS (MultiVersion File System) and can be mounted on a workstation like an ordinary network drive.
On a ClearCase server (a.k.a. VOB) you can store several versions of the same file, each on a different code branch. The sets of files visible to a user are called views. Each view has a configuration (a.k.a. a configuration specification) which defines which files and versions are visible to the current user. A typical config spec looks like this:
# From wikipedia: http://en.wikipedia.org/wiki/IBM_Rational_ClearCase#Configuration_specifications
# Show all elements that are checked out to this view, regardless any other rules.
element * CHECKEDOUT
# For all files named 'somefile', regardless of location, always show the latest version
# on the main branch.
element .../somefile /main/LATEST
# Use a specific version of a specific file. Note: This rule must appear before
# the next rule to have any effect!
element /vobs/project1/module1/a_header.h /main/proj_dev_branch/my_dev_branch1/14
# For other files in the 'project1/module1' directory, show versions
# labeled 'PROJ1_MOD2_LABEL_1'. Furthermore, don't allow any checkouts in this path.
element /vobs/project1/module1/... PROJ1_MOD2_LABEL_1 -nocheckout
# Show the 'ANOTHER_LABEL' version of all elements under the 'project1/module2' path.
# If an element is checked out, then branch that element from the currently
# visible version, and add it to the 'module2_dev_branch' branch.
element /vobs/project1/module2/... ANOTHER_LABEL -mkbranch module2_dev_branch
This is related to WiX:
I have a situation in which I have to deploy a file into multiple directories whose values are fetched from the registry. These directories could number from one to many.
I don't want to create too many Directory entries whose values would be determined at runtime.
Can we call a custom action in a loop that detects the target directories and sets up our target folder values?
I know we can do such copying inside a custom action, but I'm looking for a way to do this via WiX entries.
I was reading about the DuplicateFiles action but couldn't work out a proper way to achieve my goal.
Thanks a lot
The WiX CopyFile element maps to the DuplicateFiles action. You can use AppSearch to set properties and then use CopyFile to duplicate a file into a directory. DuplicateFiles is smart enough to do nothing if the property is null.
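A rough sketch of that combination for one target directory (the registry key, property name and file below are placeholders for whatever your setup actually uses):
<Property Id="COPYDIR1">
  <RegistrySearch Id="FindCopyDir1" Root="HKLM"
                  Key="SOFTWARE\MyCompany\MyApp" Name="ExtraInstallDir1" Type="directory" />
</Property>

<DirectoryRef Id="INSTALLFOLDER">
  <Component Id="SettingsCmp" Guid="*">
    <File Id="SettingsFile" Source="settings.xml" KeyPath="yes">
      <!-- Compiles into the DuplicateFile table; DuplicateFiles skips it if COPYDIR1 is null. -->
      <CopyFile Id="CopySettingsToDir1" DestinationProperty="COPYDIR1" />
    </File>
  </Component>
</DirectoryRef>
One Property/CopyFile pair like this would be needed per possible target directory.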
If the number of copies is known when you create your installer, you can just do that. If you think it's going to be more dynamic at runtime, you can write a custom action that emits temporary rows to the DuplicateFile table; that way DuplicateFiles and RemoveDuplicateFiles still do the heavy lifting.
You can use the principles described at Dynamic Windows Installer UI.