boost log every hour - file

I'm using boost log and I want to make basic log principal file: new error log at the beginning of each hour (if error exists), and to name it like "file_%Y%m%d%H.log".
I have 2 problems with this boost library:
1. How to rotate file at the beginning of each hour?
This isn't possible with rotation_at_time_interval parameter because it creates new file regarding first written record in file, and the hour in file name doesn't match that rule. Is it possible to have multiple rotation_at_time_point for one file in sink or is there some other solution?
2. When file exceed some size I want it to start new file and in that case it should append some index to file name. With adding rotation_size parametar and %N to file name it will increment N all the time while application is running. I want that N to be reset at the beginning of each hour, just as my file name changes. Does anybody have any idea how to do that with this boost log library?
This is basic principal in creating log files in industry. I really don't understand how this can't be done with library which is dedicated for creating log files.

Library itself doesn't provide a way to rotate file at the begging of every hour, but i had same problem so i used a function wrapper, which return true on begging of every hour.
I find this way better for me, because i can controll efficency of code.
from boost.org:
bool is_it_time_to_rotate();
void init_logging(){
boost::shared_ptr< sinks::text_file_backend > backend =
boost::make_shared< sinks::text_file_backend >(
keywords::file_name = "file_%5N.log",
keywords::time_based_rotation = &is_it_time_to_rotate
);
}
For a second question i really dont undrestand it well.

Related

Check whether file exists in server with partial name

I want to search for a file, say with name having the date time stamp (DDMMYYYYhhmmss)(14122017143339). But in the server possibilities are there that the filename which I am expecting can be like either (14122017143337 OR 14122017143338 OR 14122017143339 OR 14122017143340) as there is a minute change in the seconds.
Now, I am trying to search for the file with only a portion of its name say like (DDMMYYYYhhmm)only uptil the minute. Meaning the file which i am expecting should contain the string (141220171433) in its name.
Can someone help on how can we achieve this Using Java?
Note - Am using Selenium for my coding purposes.
Below code in in Java and will find all files in a folder. you can search for required file name to match
File[] allFiles = new File("FOlder path").listFiles();
for (File f : allFiles)
{
if (f.isFile())
{
if(file.getName().contains("DDMMYYYYhhmm"))
{
System.out.println("true and file found");
// do something here
}
}
}

Building a Route to output differences of two files

Trying to put together a file diff route... could someone help? here is what I have ->
CsvDataFormat csv = new CsvDataFormat();
csv.setDelimiter(",");
from("file:inputdir?delete=true&sortBy=ignoreCase:file:name")
.unmarshal(csv)
.pollEnrich("file:backup?fileName=test.csv&sendEmptyMessageWhenIdle=true")
.unmarshal(csv)
// Need to aggregate here!!!!
.log("test");
A csv file gets dropped in the /input directory and then a backup file is consumed from the /backup directory. I would like to compare these two files and output the difference.
This is not a specific Camel problem. In order to solve this problem you may implement a diff functionality on your own, or you may use an existing library such as java-diff-utils.
Pseudocode:
// read file 1 into a list "list1"
// read file 2 into a list "list2"
// use java-diff-utils to calculate the difference
Patch patch = DiffUtils.diff(list1, list2);

find file content size docx,pptx etc

I want to find out the size of the content inside a docx,pptx etc. Is there any package which can be used for this? I googled and found that POI is used widely to read/write to MS file types. But not able to find the correct api to find the size of the file content. I want to know the actual content size not the compressed file size which can be seen from properties.
Finally i found the way, but it is throwing OOM exception if the file is too large.
OPCPackage opcPackage = OPCPackage.open(file.getAbsolutePath());
XWPFDocument doc = new XWPFDocument(opcPackage);
XWPFWordExtractor we = new XWPFWordExtractor(doc);
String paragraphs = we.getText();
System.out.println("Total Paragraphs: "+paragraphs.length() / 1024);
Please help me if there are any other better way to do this.
Ok this has been asked long time ago and there is also no response to this question. I have not used OPCPackage and hence my answer is not based on that.
DOCX (and for that matter PPTX as well as XSLX) files are all zip files having a particular structure.
We could hence use the java.util.zip package and enumerate the entries of the zip file and get the size of the zip entry xl for xlsx file and word for docx files. Probably a more generic method would be to ignore the following top-level zip entries i.e. zip entries starting with :
docProps
_rels
[Content_Types].xml
The size of the remaining zip entry (do not ignore any folder within this zip entry) would tell you the correct size of the content.
This method is also very efficient - you only read the entries of the zip file and not the zip file itself hence obtaining the size information would run with negligible time and memory resources. For a quick start I was able to get the size of a 4MB docx file in fraction of a second.
A "good-enough" but not adequately working piece of code using this approach is pasted below. Please feel free to use this as a starting point and fix bugs if found. It would be great if you can post back the modifications or corrections so that others can benefit
private static final void printUnzippedContentLength() throws IOException
{
ZipFile zf = new ZipFile(new File("/home/chaitra/verybigfile.docx"));
Enumeration<? extends ZipEntry> entries = zf.entries();
long sumBytes = 0L;
while(entries.hasMoreElements())
{
ZipEntry ze = entries.nextElement();
if(ze.getName().startsWith("docProps") || ze.getName().startsWith("_rels") || ze.getName().startsWith("[Content_Types].xml"))
{
continue;
}
sumBytes += ze.getSize();
}
System.out.println("Uncompressed content has size " + (sumBytes/1024) + " KB" );
}

Importing Excel file with dynamic name into SQL table via SSIS?

I've done a few searches here, and while some issues are similar, they don't seem to be exactly what I need.
What I'm trying to do is import an Excel file into a SQL table via SSIS, but the problem is that I will never know the exact filename. We get files at no steady interval, and the file usually has a date/month in the name. For instance, our current file is "Census Data - May 2013.xls". We will only ever load ONE file at a time, so I don't need to loop through a directory for multiple Excel files.
My concept is that I can take this file, copy it to a "Loading" directory, and load it from there. At the start of the package, I will first clear out the loading directory, then scan the original directory for an Excel file, copy it to the loading directory and then load it into SQL. I suppose I may have to store the file names somewhere so I don't copy the same file into the loading directory in subsequent months, but I'm not really sure of the best way to handle that.
I've pretty much got everything down except the part that scans the directory for the Excel file and copies it to the loading directory. I've taken the majority of my info from this page, which (again) is close to what I want to do but not quite exactly the solution I need.
Can anyone get me over the finish line? I can't seem to get the Excel Connection Manager right (this is my first time using variables), and I can't figure out how to get the file into the Loading directory.
Problem statement
How do I dynamically identify a file name?
You will require some mechanism to inspect the contents of a folder and see what exists. Specifically, you are looking for an Excel file in your "Loading" directory. You know the file extension and that is it.
Resolution A
Use a ForEach File Enumerator.
Configure the Enumerator with an Expression on FileSpec of *.xls or *.xlsx depending on which flavor of Excel you're dealing with.
Add another Expression on Directory to be your Loading directory.
I typically create SSIS Variables named FolderInput and FileMask and assign those in the Enumerator.
Now when you run your package, the Enumerator is going to look in Diretory and find all the files that match the FileSpec.
Something needs to be done with what is found. You need to use that file name that the Enumerator returns. That's done through the Variable Mappings tab. I created a third Variable called CurrentFileName and assign it the results of the enumerator.
If you put a Script Task inside the ForEach Enumerator, you should be able to see that the value in the "Locals" window for #[User::CurrentFileName] has updated from the Design time value of whatever to the "real" file name.
Resolution B
Use a Script Task.
You will still need to create a Variable to hold the current file name and it probably won't hurt to also have the FolderInput and FileMask Variables available. Set the former as ReadWrite and the latter as ReadOnly variables.
Chose the .NET language of your choice. I'm using C#. The method System.IO.Directory.EnumerateFiles
using System;
using System.Data;
using System.IO;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
namespace ST_fe2ea536a97842b1a760b271f190721e
{
[Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute]
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{
public void Main()
{
string folderInput = Dts.Variables["User::FolderInput"].Value.ToString();
string fileMask = Dts.Variables["User::FileMask"].Value.ToString();
try
{
var files = Directory.EnumerateFiles(folderInput, fileMask, SearchOption.AllDirectories);
foreach (string currentFile in files)
{
Dts.Variables["User::CurrentFileName"].Value = currentFile;
break;
}
}
catch (Exception e)
{
Dts.Events.FireError(0, "Script overkill", e.ToString(), string.Empty, 0);
}
Dts.TaskResult = (int)ScriptResults.Success;
}
enum ScriptResults
{
Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
};
}
}
Decision tree
Given the two resolutions to the above problem, how do you chose? Normally, people say "It Depends" but there only possible time it would depend is if the process should stop/error out in the case that more than one file did exist in the Loading folder. That's a case that the ForEach enumerator would be more cumbersome than a script task. Otherwise, as I stated in my original response that adds cost to your project for Development, Testing and Maintenance for no appreciable gain.
Bits and bobs
Further addressing nuances in the question: Configuring Excel - you'll need to be more specific in what isn't working. Both Siva's SO answer and the linked blogspot article show how to use the value of the Variable I call CurrentFileName to ensure the Excel File is pointing to the "right" file.
You will need to set the DelayValidation to True for both the Connection Manager and the Data Flow as the design-time value for the Variable will not be valid when the package begins execution. See this answer for a longer explanation but again, Siva called that out in their SO answer.

Saving a Binary tree to a file in C

I am currently doing a Customers program when a user can add/edit/search/list customers. I decided to use a binary tree as the backbone of this program. My idea was to then save each item in the tree to "customers.dat" right before the program closes, then on start up load everything from the file to the tree. So far so good, however after finally managing to save the binary search tree into a file, I have one bug.
Lets say I add 3 customers the first time. I then close the program, and when I reopen it I will find the same 3 customers in the tree. however, the next time I open the file, it gives me an error from one of my predefined errors, which occurs when a node is not able to identity whether to go left or right, maybe because its empty or uncomparable. here are some code snippets. I aslo tried using other fileopen techniques other than a+b, and I had no such error, but by the way I designed my program, I need the append method or else only one record will save.
Customers are stored in Cstmr in the header:
typedef struct customer
{
char Name[MAXNAME];
char Surname[MAXNAME];
char ID[MAXID];
char Address[MAXADDRESS];
} Cstmr;
else:
void CustomerTreeToFile(Tree*pt)
{
if (TreeIsEmpty(pt))
puts("Nothing to save!");
else
Traverse(pt,saveItem); //Traverses each node, and appliess the function
//saveItem to each node
}
void saveItem(Cstmr C)
{
save = C;
customers = fopen("customers.dat","ab+");
fwrite(&C,sizeof (Cstmr), 1, customers);
fclose(customers);
}
The problem is that you are always appending all the data to the same file ... so everything from the previous run is duplicated.
One solution would be to delete (unlink) the file before saving the data with your current approach.
However, as Freezerburn pointed out earlier, it's more economic to not open and close the file for each item. Just open the file once in overwrite mode (i.e. not append), then write all the data, then close the file. Should be much faster, too.
Another problem is that you save your data in binary format. Try to define an easy to read textual format. That would have made the problem obvious ...

Resources