Project Description:
As a learning exercise for ASP.NET MVC 4, I'm creating a site builder / multi-tenancy site. It's nothing too fancy, just WYSIWYG editing on templates with custom routing to direct users to the correct template based on subdomain. So usr1.mysite.com is directed to the template edited by usr1. My main concern at the moment is my method of storing the edited templates.
Storage Dilemma:
At first I was simply going to make the templates into views and store the changes made by the user in the database. When usr1's template was displayed the system would pull up the view and populate it with usr1's data.
Instead, I've implemented a system that takes the user's modified template and saves the whole thing as static HTML files in the file system. Only the path to usr1's site (and some other details) is saved in the database. When usr1.mysite.com is called, I just have a "content" controller to retrieve the correct HTML file.
Question:
Is there any reason to choose the database/view method over the static html file method?
Also, I'm not concerned with having dynamic content in the end-user pages; this is one reason I even tried the file method.
Decision (EDIT):
I'm implementing the file method. After more research (verifying my previous research), I have little doubt the file system will handle even a few hundred sites. I will structure it so that user data directories are grouped into group directories based on a naming convention I've yet to dream up, probably something like 000usr1, 000usr2 inside a 000 group directory, with a goal of fewer than 100 files/folders in any given directory and less than 4 levels deep. That should give me the capability of holding 10,000 sites. I have no plans for activity anywhere near that level with this software, but I do want to get up and running, torture it for a while, and see what it's capable of handling. If anyone expresses any interest I'll post back some results.
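For example, a rough sketch of the grouping logic I have in mind (written in Java just for illustration; the root path and naming are placeholders):

```java
import java.nio.file.Path;

// Hypothetical sketch of the grouping convention above: 100 user
// directories per group directory, so user 1 lives in <root>/000/000usr1,
// user 150 in <root>/001/001usr150, never more than 3 levels deep.
public class SitePathResolver {

    private final Path root;

    public SitePathResolver(Path root) {
        this.root = root;
    }

    public Path directoryFor(long userId) {
        // 100 user directories per group keeps every directory under the
        // 100-entry goal, and 100 groups x 100 users covers 10,000 sites
        String group = String.format("%03d", userId / 100);
        return root.resolve(group).resolve(group + "usr" + userId);
    }
}
```

So directoryFor(150) against a root of /var/sites would give /var/sites/001/001usr150.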
Related
I am building a web application where the users can create reports and then upload some images for the created reports. Those images will be rendered in the browser when the user clicks a button on the report page. The images are confidential and only authorized users will be able to access them.
I am aware of the pros and cons of storing images in the database, in the filesystem, or in a service like Amazon S3. For my application, I am inclined to keep the images in the filesystem and the paths of the images in the database. That means I have to deal with the problems arising around distributed transaction management. I need some advice on how to deal with these problems.
1- I believe one of the proper solutions is to use technologies like JTA and XADisk. I am not very knowledgeable about these technologies, but I believe two-phase commit is how atomicity is achieved. I am using MySQL as the database, and it seems two-phase commit is supported by MySQL. The problem with this approach is that XADisk does not seem to be an active project, there is not much documentation about it, and there is the fact that I am not very knowledgeable about the ins and outs of this approach. I am not sure if I should invest in it.
2- I believe I can get away with some of the problems arising from the violation of ACID properties for my application. While uploading images, I can first write the files to disk; if this operation succeeds, I can update the paths in the database. If the database transaction fails, I can delete the files from the disk. I know that is still not bulletproof; a power outage might occur just after the DB transaction, or the disk might not be responsive for a while, etc. I know there are also concurrency issues, for instance if one user tries to modify an uploaded image while another tries to delete it at the same time, there will be some problems. Still, the chances of concurrent updates in my application are relatively low.
I believe I can live with orphan files on the disk or orphan image paths in the DB if such exceptional cases occur. If a file path exists in the DB but not in the file system, I can show a notification to the user on the report page and they can try to re-upload the image. Orphan files in the file system would not be too much of a problem; I could run a process to detect such files from time to time. Still, I am not very comfortable with this approach.
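Something along these lines is what I have in mind for option 2 (a rough sketch only; the table and column names are placeholders):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.PreparedStatement;
import javax.sql.DataSource;

// Rough sketch of option 2: write the file first, then record its path
// in the DB, and compensate by deleting the file again if the DB write fails.
public class ImageUploadService {

    private final DataSource dataSource;
    private final Path imageRoot;

    public ImageUploadService(DataSource dataSource, Path imageRoot) {
        this.dataSource = dataSource;
        this.imageRoot = imageRoot;
    }

    public void attachImage(long reportId, String fileName, byte[] content) throws Exception {
        Path target = imageRoot.resolve(String.valueOf(reportId)).resolve(fileName);
        Files.createDirectories(target.getParent());
        Files.write(target, content);                  // 1. write the file to disk first

        try (Connection con = dataSource.getConnection()) {
            con.setAutoCommit(false);
            try (PreparedStatement ps = con.prepareStatement(
                    "INSERT INTO report_image (report_id, path) VALUES (?, ?)")) {
                ps.setLong(1, reportId);
                ps.setString(2, target.toString());
                ps.executeUpdate();
                con.commit();                          // 2. then record the path
            } catch (Exception dbFailure) {
                con.rollback();
                Files.deleteIfExists(target);          // 3. compensate on DB failure
                throw dbFailure;
            }
        }
        // A crash between steps 1 and 2 still leaves an orphan file,
        // which is what the periodic cleanup process is for.
    }
}
```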
3- The last option might be to not store file paths in the DB at all. I can structure the filesystem so that I can infer the file path in code and load all images at once. For instance, I can create a folder named after the report id for each report. When a request is made to load the images of a report, I can load them all at once since I know the report id. That might end up with a huge number of folders in the filesystem, and I am not sure such a design is acceptable. Concurrency issues will still exist in this scheme.
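For option 3, the reading side would be little more than listing a directory derived from the report id, something like this sketch (the root location is a placeholder):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Sketch of option 3: no paths in the DB, the folder name is derived
// from the report id, so all images of a report can simply be listed.
public class ReportImages {

    private final Path root;

    public ReportImages(Path root) {
        this.root = root;
    }

    public List<Path> imagesFor(long reportId) throws IOException {
        Path dir = root.resolve(String.valueOf(reportId));
        if (!Files.isDirectory(dir)) {
            return List.of();       // no folder yet means no images yet
        }
        try (Stream<Path> files = Files.list(dir)) {
            return files.collect(Collectors.toList());
        }
    }
}
```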
I would appreciate some advice on which approach I should follow.
I believe you are trying to be ultra-correct, and maybe that much is not needed, but I also faced a similar situation some time ago and explored different possibilities. I disliked options along the lines of your option 1, but for options 2 and 3 I have had success with different approaches.
Let's first sum up the list of concerns:
You want the file to be saved
You want the file path to be linked to the corresponding entity (i.e. the report)
You don't want a file path to be linked to a file that doesn't exist
You don't want files in the filesystem not linked to any report
And the different approaches:
1. Using DB
You can ensure transactions in the DB with pretty much any relational database, and with S3 you get read-after-write consistency for new objects: if you PUT an object and get a 200 OK, it will be readable. Now, how do you put all this together? You need to keep track of the process. I can see two ways:
1.1 With a progress table
The upload request is saved to a table with whatever is needed to identify this file: report id, temp uploaded file path, destination path, and a status column
You save the file
If saving the file fails, you can update the record in the table, or delete it
If saving the file is successful, in a transaction:
update the progress table with successful status
update the table where you actually save the relationship report-image
Have a cron job, not checking the filesystem but checking the progress table. If there is any orphan file in the filesystem, it was definitely added to the table (that was step 1). Here you can decide whether to delete the file or, if you have enough info, continue the aborted process by re-triggering step 4.
The progress table could even be the same report-image relationship table, with some extra status columns (a rough sketch of this flow follows below).
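A minimal sketch of flow 1.1, assuming a dedicated upload_progress table and a report_image table (both names made up for illustration):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

// Hypothetical sketch of flow 1.1 with a dedicated progress table.
public class TrackedUpload {

    public void upload(Connection con, long reportId, Path tempFile, Path destination)
            throws Exception {
        // 1. record the intent first (committed on its own, assuming auto-commit)
        long progressId;
        try (PreparedStatement ps = con.prepareStatement(
                "INSERT INTO upload_progress (report_id, temp_path, dest_path, status) "
                        + "VALUES (?, ?, ?, 'PENDING')", Statement.RETURN_GENERATED_KEYS)) {
            ps.setLong(1, reportId);
            ps.setString(2, tempFile.toString());
            ps.setString(3, destination.toString());
            ps.executeUpdate();
            try (ResultSet keys = ps.getGeneratedKeys()) {
                keys.next();
                progressId = keys.getLong(1);
            }
        }

        // 2. save the file
        try {
            Files.createDirectories(destination.getParent());
            Files.move(tempFile, destination);
        } catch (Exception fileFailure) {
            updateStatus(con, progressId, "FAILED");   // 3. mark the failed attempt
            throw fileFailure;
        }

        // 4. in one transaction: link report to image and close the progress record
        con.setAutoCommit(false);
        try (PreparedStatement link = con.prepareStatement(
                     "INSERT INTO report_image (report_id, path) VALUES (?, ?)");
             PreparedStatement done = con.prepareStatement(
                     "UPDATE upload_progress SET status = 'DONE' WHERE id = ?")) {
            link.setLong(1, reportId);
            link.setString(2, destination.toString());
            link.executeUpdate();
            done.setLong(1, progressId);
            done.executeUpdate();
            con.commit();
        } catch (Exception dbFailure) {
            con.rollback();      // the cron job will later find the PENDING row
            throw dbFailure;
        }
    }

    private void updateStatus(Connection con, long progressId, String status) throws Exception {
        try (PreparedStatement ps = con.prepareStatement(
                "UPDATE upload_progress SET status = ? WHERE id = ?")) {
            ps.setString(1, status);
            ps.setLong(2, progressId);
            ps.executeUpdate();
        }
    }
}
```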
1.2 With a queue system
Like RabbitMQ, SQS, AMQ, etc
A very similar approach can be taken with any queue system instead of a DB table. I won't give many details because it depends on your actual infrastructure; this is just the general idea.
The upload request goes to a queue; you send a message with whatever you may need to identify this file, the report id, and, if you want, a tentative final path.
You upload the file
A worker reads pending messages in the queue and does the work. The message is marked as consumed only when everything goes well.
If something fails, naturally the message will come back to the queue
The next time the message is read, the worker has enough info to see whether there is work to resume, or even a file to delete if resuming is not possible (see the sketch below)
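For example, with RabbitMQ the consuming side could look roughly like this (the queue name and message format are assumptions; the message is only acknowledged once both the file and the DB relationship have been written):

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.DeliverCallback;

// Rough sketch of flow 1.2: a worker consumes upload messages and only
// acks them when the whole job succeeded; otherwise the message is requeued.
public class UploadWorker {

    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        Connection connection = factory.newConnection();
        Channel channel = connection.createChannel();
        channel.queueDeclare("report-image-uploads", true, false, false, null);

        DeliverCallback callback = (consumerTag, delivery) -> {
            String message = new String(delivery.getBody(), "UTF-8"); // e.g. "reportId;tempPath"
            try {
                processUpload(message);   // move the file + link it to the report
                channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
            } catch (Exception e) {
                // not acked: the broker redelivers, and the next attempt can
                // resume the work or delete the half-written file
                channel.basicNack(delivery.getEnvelope().getDeliveryTag(), false, true);
            }
        };
        channel.basicConsume("report-image-uploads", false, callback, consumerTag -> { });
    }

    private static void processUpload(String message) {
        // parse the message, move the uploaded file to its final location,
        // then record the report-image relationship in the DB
    }
}
```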
In both cases concurrency problems won't be straightforward to manage, but they can be managed (relying on DB locks in the first case and FIFO queues in the second), always with some application logic.
2. Without DB
To some extent, a system without a database would be perfectly acceptable, if we can defend it as a proper convention-over-configuration design.
You have to deal with 3 things:
Save files
Read files
Make sure that the structure of the filesystem is manageable
Let's start with 3:
Folder structure
In general, something like one folder per report id will be too simple, maybe hard to maintain, and ultimately too flat. This will cause issues: if we have an images folder with one folder per report, and tomorrow you have, let's say, 200k reports, the images folder will have 200k entries, and even an ls will take too much time, the same as any programming language trying to access it. That will kill you.
You can think about something more sophisticated. Personally, I like a way that I learnt from Magento 1 more than 10 years ago and have used a lot since then: using a folder structure that follows some outer rules first, extended with rules derived from the file name itself.
We want to save a product image. The image name is: myproduct.jpg
The first rule is: for product images I use /media/catalog/product
Then, to avoid too many images in the same folder, I create one folder for each letter of the image name, up to some number of letters, let's say 3. So my final path will be something like /media/catalog/product/m/y/p/myproduct.jpg
Like this, it is clear where to save any new image. You can do something similar using your report ids, categories, or anything that makes sense for you. The final objective is to avoid a structure that is too flat and to create a tree that makes sense to you and that can be automated easily (see the sketch below).
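A small sketch of that sharding rule (the base path and depth are just the example values used above):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch of the Magento-style sharding described above: the first few
// letters of the file name become nested folders.
public final class ImagePaths {

    private static final int SHARD_DEPTH = 3;

    public static Path shardedPath(Path base, String fileName) {
        Path dir = base;
        for (int i = 0; i < SHARD_DEPTH && i < fileName.length(); i++) {
            dir = dir.resolve(String.valueOf(Character.toLowerCase(fileName.charAt(i))));
        }
        return dir.resolve(fileName);
    }

    public static void main(String[] args) {
        // -> /media/catalog/product/m/y/p/myproduct.jpg
        System.out.println(shardedPath(Paths.get("/media/catalog/product"), "myproduct.jpg"));
    }
}
```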
And that takes us to the next part:
Read and write.
I implemented a similar system before quite successfully. It allowed me to save files easily and to retrieve them easily, with locations that were purely dynamic. The parts here were:
S3 (but you can do with any filesystem)
A small microservice acting as a proxy for both read and write.
Some namespace system and attached logic.
The logic is quite simple. The namespace lets me know where the file will be saved. For example, the namespace can be companyname/reports/images.
Let's say I develop a microservice for read and write:
For saving a file, it receives:
namespace
entity id (i.e. your report)
file to upload
And it will do:
based on the rules I have for that namespace, the id, and the file name, it will save the file in the corresponding folder
it doesn't return the physical location. That remains unknown to the client.
Then, for reading, clients will use a URL that also follows the convention. For example, you can have something like
https://myservice.com/{NAMESPACE}/{entity_id}
And based on the logic, the microservice will know where to find that in the storage and return the image.
If you have more than one image per report, you can do different things, such as:
- you may want to have a third slug in the path such as https://myservice.com/{NAMESPACE}/{entity_id}/1 https://myservice.com/{NAMESPACE}/{entity_id}/2 etc...
- if it is for your internal application's usage, you can have one endpoint that returns the list of all eligible images; let's say https://myservice.com/{NAMESPACE}/{entity_id} returns an array with all image URLs
I implemented this with a quite simple YAML config to define the logic and very simple code reading that config. That gave me a lot of flexibility: for example, saving reports in totally different paths, servers, or S3 buckets if they belong to different companies or are different report types.
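As a rough illustration of the namespace idea (with a plain in-code map standing in for the YAML config, and made-up namespace and paths):

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;

// Hypothetical sketch: a tiny config (here a Map, in practice loaded from
// YAML) maps each namespace to a base location, and the service derives
// the physical path from namespace + entity id.
public class NamespaceResolver {

    private final Map<String, Path> namespaceRoots = Map.of(
            "companyname/reports/images", Paths.get("/storage/companyname/reports/images"));

    public Path resolve(String namespace, String entityId, String fileName) {
        Path root = namespaceRoots.get(namespace);
        if (root == null) {
            throw new IllegalArgumentException("Unknown namespace: " + namespace);
        }
        // clients only ever see /{namespace}/{entityId}; the physical layout
        // below the root can change without breaking those URLs
        return root.resolve(entityId).resolve(fileName);
    }
}
```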
I have a web application developed with JSF 2 and PrimeFaces. The project has been frozen for months, but it's quite advanced; the whole application runs inside the same container under GlassFish, so it's a monolith.
My application has a user interface, and its purpose is to offer users the possibility to organize URLs to tutorials (of any kind) as cards, with tags for classification, into folders. So each user has their own tree; they can search inside other users' trees, create a link to a file in their own tree, copy an entire folder, reorganize it, etc.
Nowadays we hear a lot about microservices, Spring Boot, AngularJS, React, etc. I like developing with JSF, it's a great framework, but I'm asking myself about refactoring my application, at least the necessary parts, into microservices, and whether JSF is appropriate for that or whether I should use other tools.
What I like with JSF, for example, is how easy it is to create views, its component approach, and how it handles the full cycle of a request.
For example, with a simple folder creation form:
I have to choose the parent folder, so I can bind a search component to a backing bean that searches indirectly in my DB using a DAO (in my app, an EJB using JPA). That happens at the Invoke Application phase and refreshes my form's list with Ajax at the end. When I submit the form, I can also bind a converter to the search component to retrieve a Folder object directly; the converter also uses a DAO to retrieve the object that I need at the Invoke Application phase to finish the job.
I also use validators to check different attributes of a new folder; usually I declare them inside my entity classes (Folder, User, ...) with annotations like @NotNull, etc. Before I save the folder in my DB, I also check the user's rights to see if they can write inside the parent folder, and so on. I do that inside the backing bean, so at the Invoke Application phase, and return a faces message if anything goes wrong.
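To give an idea, a simplified, hypothetical version of such an entity looks like this (field names are just for illustration):

```java
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.ManyToOne;
import javax.validation.constraints.NotNull;
import javax.validation.constraints.Size;

// Simplified sketch of a Folder entity: the Bean Validation annotations
// sit on the entity itself, and JSF/JPA apply them before the backing
// bean persists the folder.
@Entity
public class Folder {

    @Id
    @GeneratedValue
    private Long id;

    @NotNull
    @Size(min = 1, max = 255)
    private String name;

    // the parent folder chosen through the search component and its converter
    @ManyToOne
    private Folder parent;

    // getters and setters omitted; the write-permission check on the parent
    // stays in the backing bean, at the Invoke Application phase
}
```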
When I read about microservices, I see that you can use them directly from a form using JSON for communication, so it seems quite different. For example, if I have a microservice for the CRUD operations on my folders, are the validators and converters part of that service, or are they standalone services? And what about the security checks? That kind of architecture is quite mysterious to me.
PS: English is not my mother tongue, so please be indulgent :)
AngularJS is pretty ancient, man :)
You have to look at the pain points to identify ways to tear down your monolith. Monolith pains are usually a slow and painful dev cycle and difficult manual test phases. If you did the entire Arquillian thing and have full continuous integration with single-button deployments, you've slain the beast the hard way; not many have braved that route. But if you're looking at mounting feature creep with code freezes and manual test cycles, then yeah, you kind of want to try to pull some of those features out into a service you can redeploy very quickly.
I have recently been asked to take over the administration of a website that is built on CakePHP 3.x.
I have never worked with CakePHP before. Everything I have read talks about using a command-line interface, but I haven't done that since I was at uni.
I discovered a Dashboard on the website where I can enter or edit the products, but I was wondering about the pages on the site.
I had to change some phone numbers in the footer of each page, and it was only by hunting through the files that I found src/Template/Element/footer.ctp and edited it.
Is there some way of editing the pages without finding the individual files?
No. What you're referring to (the command-line stuff) is for when you're baking files, running shell tasks, doing database migrations, installing things via Composer, using the built-in local server, etc. There are other uses too, but editing front-end files is not usually one of them.
Though there are methods for altering local files via the command line, for the things you're talking about, like editing a footer or other pages (.ctp "Template files" in Cake 3), it's standard practice to just do that manually.
See the standard path for template files in these examples:
src/Template/Users/profile.ctp
src/Template/Pages/contact.ctp
src/Template/Layout/default.ctp
A "layout" file usually fetches the header, content, and footer.
As you've found, there are also Elements, which are smaller chunks of code that are reusable across one or more Template files.
I have recently taken up Salesforce.com, and I have very little idea about how it works. Recently I was going through some of the material and I had a question...
Is there any way I can find out where a particular field/object/Visualforce page is used in an application? For example, let's say I have a field labeled Sales: I want to be able to find where that particular field is used and under which object, which Visualforce page/Apex class that object appears in, and which application that Visualforce page/Apex class is used in.
I hope I have made my question clear.
Thanks to everyone for their help.
It's not really a programming question, you might be better off asking about administrative stuff like that on salesforce.stackexchange.com.
If you have a test environment (sandbox), you could always try deleting the field there ;) I'm kidding, but if you try it, the page should display a list of the places where the field is being used.
A similar thing could be achieved by creating a change set, adding that field to it, and then checking its dependencies.
But probably the best way requires some preparation upfront. Read about the Force.com IDE (or Eclipse IDE) and how to use it to download the files that represent your object definitions, page layouts, classes, Visualforce pages, reports... This is great as a backup, but it will also let you search the files (Ctrl+H in Eclipse, or just use whatever you want once you have the files locally). Searching for the API name of the field (something like My_Custom_Field__c) should be most effective.
Pretty old thread, but adding another option. I have a free and open-source app that scans the fields in your org and returns the components they are used in (workflows, processes, page layouts, Apex, etc.).
Keep in mind that returning fields in Apex and VF is not 100% accurate, as a field with the same API name on a different object would be returned as being found in a class even though it might not be (as others have mentioned).
Also, it can take quite a while to run on large Orgs.
App: http://schemalister.herokuapp.com/
Source Code: https://github.com/benedwards44/schemalister
In our corporate directory, users can search for their coworkers. The results are then displayed in a table-like layout on a plain HTML page (the backend is PHP, if that matters). The list is limited to 25 entries.
Now a request has come up to show the presence status from Microsoft Lync next to every entry in that list. Creating a tiny Silverlight application to represent the status of a single person is quite easy, and so is placing it to the left of each name.
This way of doing it will of course result in up to 25 almost-identical Silverlight objects being created, each then accessing the Lync client API.
Another way to do it would be to place the complete listing inside a (more complex) Silverlight application, so that there would only be one instance on the page. This would also cause quite a bit of extra development work.
The question: Is it considered bad practice to create 25 instances of the same Silverlight object on one single web page?
Thanks for any input or opinion you can give,
Patrick
If you're using this in an internal corporate environment and getting the finished product out quickly is important, then you're likely fine. Each Silverlight object will need to query the Lync status of its employee, so that may be a deciding factor in terms of performance.
Alternatively, it wouldn't be all that difficult or labor-intensive to create a simple user control representing a single user's Lync status and then display them all inside a single Silverlight app.