Searching a C++ Vector for the first/last occurence of a value

I'm trying to work out the best method to search a vector of type "Tracklet" (a class I have built myself) to find the first and last occurrence of a given value for one of its variables. For example, I have the following classes (simplified for this example):

class Tracklet {
    TimePoint *start;
    TimePoint *end;
    int angle;

    public:
        Tracklet(CvPoint*, CvPoint*, int, int);
}

class TimePoint {
    int x, y, t;

    public:
        TimePoint(int, int, int);
        TimePoint(CvPoint*, int);
        // Relevant getters and setters exist here   
};

I have a vector "vector<Tracklet> tracklets" and I need to search for any tracklets with a given value of "t" for the end timepoint. The vector is ordered in terms of end time (i.e.tracklet.end->t).

I'm happy to code up a search algorithm, but am unsure of which route to take with it. I'm not sure binary search would be suitable, as I seem to remember it won't necessarily find the first. I was thinking of a method where I use binary search to find an index of an element with the correct time, then iterate back to find the first and forward to find the last. I'm sure there's a better way than that, since it wastes binary searches O(log n) by iterating.

Hopefully that makes sense: I struggled to explain it a bit! Cheers!

If the vector is sorted and contains the value,std::lower_boundwill give you an iterator to the first element with a given value andstd::upper_boundwill give you an iterator to one element past the last one containing the value. Compare the value with the returned element to see if it existed in the vector. Both these functions use binary search, so time is O(logN).

To compare ontracklet.end->t, use:

bool compareTracklets(const Tracklet &tr1, const Tracklet &tr2) {
    return (tr1.end->t < tr2.end->t);
}

and pass compareTracklets as the fourth argument tolower_boundorupper_bound

I'd just usefindandfind_end, and then do something more complicated only if testing showed it to be too slow.

If you're really concerned about lookup performance, you might consider a different data structure, like amapwith timestamp as the key and avectororlistof elements as the value.

A binary search seems like your best option here, as long as your vector remains sorted. It's essentially identical, performance-wise, to performing a lookup in a binary tree-structure.

dirkgently referredto a sweet optimization comparative. But I would in fact not use astd::vectorfor this.

Usually, when deciding to use a STL container, I don't really consider the performance aspect, but I do consider its interface regarding the type of operation I wish to use.

std::set<T>::find
std::set<T>::lower_bound
std::set<T>::upper_bound
std::set<T>::equal_range

Really, if you want an ordered sequence, outside of a key/value setup,std::setis just easier to use than any other.

  • You don't have to worry about inserting at a 'bad' position
  • You don't have problems of iterators invalidation when adding / removing an element
  • You have built-in methods for searching

Of course, you also want your Comparison Predicate to really shine (hopes the compiler inlines the operator() implementation), in every case.

But really, if you are not convinced, try a build with astd::vectorand manual insertion / searching (using the<algorithm>header) and try another build usingstd::set.

Compare the size of the implementations (number of lines of code), compare the number of bugs, compare the speed, and then decide.

Most often, the 'optimization' you aim for is actually apessimization, and in those rares times it's not, it's just so complicated that it's not worth it.

Optimization:

  • Don't
  • Expert only: Don't, we mean it

The vector is ordered in terms of time

The start time or the end time?

What is wrong with a naive O(n) search? Remember you are only searching and not sorting. You could use a sorted container as well (if that doesn't go against the basic design).

What Others Are Reading