How to retrieve all object IDs? - c

I am trying to get a list of all object IDs in a git repository, using libgit2. I can't seem to find any method for this. Does libgit2 have a method to get all object IDs (or iterate through them), or do I need to read them manually?

What you may be looking for is the revision walking API.
Description of the feature can be found here.
A test demonstrating different walking strategies may also provide you with some help
Edit: A thread in the libgit2 mailing list specifically deals with this.
A more precise answer from Vicent Marti (libgit2 maintainer) is
... Just push every single HEAD into the the walker. You won't
get any duplicate commits.
All you have to do is to push every branch and tag oids into the revision walker to recursively walk the commit history. Please note this won't retrieve dangling commits (commits or chain of commits that are not referenced by a branch nor a tag).
Edit 2: This behavior (similar to git log --all) has been successfully implemented in libgit2sharp (libgit2 .Net bindings).
Edit 3: A new feature has recently been merged which would allow to enumerate all the objects (commits, trees, blobs, ...) stored in the object database: git_odb_foreach().
This would be more in line with the git fsck scenario #MatrixFrog was talking about.
git_odb_foreach() documentation
A simple test demonstrating how to use the API

Related

Polarion document baseline with links in description field

I have a generic LiveDoc in Polarion which contains a series of referenced requirements. Recently I started to insert links into the description of some of the requirements to make it easier to navigate from one requirement to another. However, I've discovered that when I baseline the document the links in the description don't get updated to point to the baselined version of the requirement, but the links (to the same requirement) in the Linked Work Items section are updated to include the baseline revision.
Is there a way to get the links in the description to point to the baselined revision like the ones in the Linked Work Items section?
I'm using Polarion 21 R1 if that matters.
Thanks in advance for your help.
Interesting approach. I doubt you get this working 100%. HTML is notoriously hard to parse(complete and correctly), so you should avoid this workflow.
Use Linked Workitems instead and use the new Collection Feature, which most probably does what you need.
Also while it is possible to link to older / specific revisions (of artefacts) in Polarion, I never found a scenario which was maintainable and useful in same time.
Note that Revisions get big very fast (5-7 digits). Comparing or updating these links is very error prone and demanding work, full of devastating pitfalls.
We follow the approach to keep items unchanged after release and create new items instead of changing existing ones. We have then more WIs but Polarion's UI (and most peoples head) can deal with large number of WIs better than with versioned links.

Add new experimenter match field in OVS source code

I would like to add new match field of type OXM_experimenter class OVS source code, could anyone share proper document or steps to do it. It needs changes in many of the files and functions and understanding OVS source is somewhat difficult. If any added already and tested, can you guide ?
I successfully did this before, however, I don't have access to the code anymore, only bookmarks to stuff. There is an old thread in the mailing lists that may help you: Link and Link.
I wanted to handle PACKET_IN events in OVS a bit differently, so I followed the way of packets from the data plane through the upcall bit to ofproto-dpif-xlate.c. On the way, I stumbled upon a lot of constants. After adding my own to the enums, the last bit missing was the experimenter field, which was somewhere in the python scripts as described in the links above.
I hope this helps, I'm in the process of getting access to the code again I'll update my answer then. If not, the OvS discuss mailing list and archives may help you too.

database modeling for google app engine for multiple revison of entity

in my application ( kind of wiki clone ) - an article is frequently changing.
and i need to track all changes that are done on that article. { text only. }
one crude way i have done it, is to add a datetime property and create a new entity everytime something change. which is too much database wasting. { and also un-necessary index waste too. } and also need to re-create parent-child and entity relationships.
i also have log which can show changes -- but i want some thing easier , so that jumping from one version to another version could be easier.
ideas ?
thanks.
You could split the location in your wiki from the content, and link from there to the version of the page. Keep the versions in a linked list, doubly linked if you 2 way navigation, circular, whatever.
The parent child, indexing and so forth deal only with the location and the article linked to that.
Reverting to a previous change is only changing the link in the location (and pushing the changes to your indexing machine). Pruning are basic list operations, i.e. pointing the next field to a version further down and deleting the version in between.

Repository Pattern Question

I'm building an ASP.NET MVC app and I'm using a repository to store and retrieve view objects. My question is, is it okay for the implementation of the various repositories to call each other? I.E. can the ICustomerRepository implementation call an implementation of IAddressRepository, or should it handle its own updates to the address data source?
EDIT:
Thanks everyone, the the Customer/Address example isn't real. The actual problem involves three aggregates which update a fourth aggregate in response to changes in their respective states. I in this case it seems a conflict between introducing dependencies vs violating the don't repeat yourself principle.
You should have a repository for each aggregate root.
I have no knowledge of your domain-model, but it doesn't feel natural for me to have an IAddressRepository. Unless 'Address' is an aggregate root in your domain.
In fact, in most circumstances, 'Address' is not even an entity, but a value object. That is, in most cases the identity of an 'Address' is determined by its value (the value of all its properties); it does not have a separate 'Id' (key) field.
So, when this is the case, the CustomerRepository should be responsible for storing the Address, as the Address is a part of the Customer aggregate-root.
Edit (ok so your situation was just an example):
But, if you have other situations where you would need another repository in a repository, then I think that it is better to remove that functionality out of that repository, and put it in a separate class (Service).
I mean: I think that, if you have some functionality inside a repository A, that relies on another repository B, then this kind of functionality doesn't belong inside repository A.
Instead, write another class (which is called a Service in DDD), in where you implement this functionality.
Anyway, I do not think that repositories should really call each other. If you do not want to write a Service however, and if you really want to keep that logic inside the repository itself, then pass the other repository as an argument in that specific method.
I hope I made myself a bit clear. :P
They really shouldn't call each other. A Repository is an abstraction of the (more or less) atomic operations that you want to perform on your domain. The less dependency they have, the better. Realistically, any consumer of a repository should expect to be able to point the repository class at a database and have it perform the necessary domain operations without a lot of configuration.
They should also represent "aggregates" in your domain - i.e. key focal points that a lot of functionality will be based around. I'm wondering why you would have a separate address information repository? Shouldn't that be part of your customer repository?
This depends on the type of repository (or at least the consequences do) but in general if you have data repositories calling each other you're going to run into problems with things like cyclical (repo A -> requires B -> requires C -. oops, requires A) or recursive data loads (A-> requires B & C -> C-requires D, E -> .... ->..ad nauseum). Testing also becomes more difficult.
For example, you need to load your address repository to properly run your customer repository, because the customer repository calls the address repo. If you need to test the customer repo, you'll need to do a db load of the addresses or mock them in some way, and ultimately you won't be able to load and test any single system repository without loading them all.
Having those dependencies is also kind of insidious because they're often not clear - usually you're dealing with a repository as a data-holding abstraction - if you have to be conscious of how they depend on each other you can't use them as an abstraction, but have to manage the load process whenever you want to use them.

How best to branch in Clearcase?

I've previously documented my opinions on Clearcase as a source control system, but unfortunately I am still using it. So I turn to you guys to help me alleviate one of my frustrations.
We have just moved from a one-branch-per-developer system, to a one-branch-per-task in an attempt to improve some of the issues that we've been having with determining why certain files were changed. Generally I am happy with the solution, but there is one major issue. We are using simple scripts to start and end tasks which create a new branch named with the username and task number and then updates the local snapshot view to have a config spec similar to the following:
element * CHECKEDOUT
element * .../martin_2322/LATEST
element * /main/LATEST -mkbranch martin_2322
load /Project/Application
Let's say that my project has two coupled files A.cs and B.cs. For my first task I make changes to A on the branch. Then I need to stop working on task 2322 for whatever reason and start work on task 2345 (task 2322 is not finished, so I don't merge it back into main).
I create a new task branch 2345, edit both A.cs and B.cs and merge the results back into main. Now I go back to work on 2322, so I change my config spec back to one defined above. At this point I see the A.cs file from the task branch (as I edited it earlier, so I get the version local to that branch) and the latest version of B.cs from main. Since I don't have the changes made to A.cs on the 2345 branch the build breaks. What I need instead is to be able to pick up task 2322 from where I left off and see it with the old version of A.cs - the one that was latest in main when the branch was created.
The way I see it I have a few options to fix this:
Change the config spec so that it gets files from main at the right date. This is easy enough to do if I know the date and don't mind setting it by hand, but I can't figure out how to automate this into our task switching scripts. Is there anyway to get the creation date of a branch?
Create a label for each branch on main. Theoretically simple to do, but the labelling system in our install of CC is already collapsing under the weight of a few hundred labels, so I don't know if it will cope with one per developer per branch (notice that the task in my example is 2322 and we're only about an quarter of the way through the project)
Merge out from main into the task branch. Once again should work, but then long running branches won't just contain the files changed for that task, but all files that needed to be merged across to get unrelated things working. This makes them as complicated as the branch-per-developer approach. I want to see which files were changed to complete a specific task.
I hope I'm just missing something here and there is a way of setting my config spec so that it retrieves the expected files from main without clunky workarounds. So, how are you guys branching in Clearcase?
A few comments:
a branch per task is the right granularity for modifying a set of file within a "unit of work". Provided the "task" is not too narrow, otherwise you end up with a gazillon of branches (and their associated merges)
when you create a config spec for a branch, you apparently forget the line for new elements (the one you "add to source control")
Plus you may consider branching for a fix starting point, which would solve the "old version of A.cs - the one that was latest in main when the branch was created" bit.
I know you have too much labels already, but you could have a script to "close" a task which would (amongst other things) delete that starting label, avoiding label cluttering.
Here the config spec I would use:
element * CHECKEDOUT
element * .../martin_2322/LATEST
element * STARTING_LABEL_2322 -mkbranch martin_2322
# selection rule for new "added to source control" file
element * /main/0 -mkbranch martin_2322
load /Project/Application
I would find this much more easier than computing the date of a branch.
do not forget you can merge back your task to main, and merge some your files from your finished task branch to the your new current task branch as well, if you need to retrofit some your fixes back to that current task as well.
You can get the creation date for a branch by using the describe command in cleartool.
cleartool describe -fmt "%d" -type martin_2322
This will printout the date and time that the branch was created. You can use this to implement your first option. For more information, you could read the following cleartool man pages, but hopefully the above command is all you need.
cleartool man describe
cleartool man fmt_ccase
We use Clearcase, and we find that creating a branch for a release is often much easier than doing it by task. If you do create it by task, then I'd have a 'main branch' for that release, and branch the tasks off that branch, and then merge them back in when finished to merge them back to the trunk.

Resources