Flocking Sim Configuration Tool

The second task for the flocking simulation is to create a tool for handling configuration of the simulation. We are given a list of options that need to be configurable. The data also needs to be saved so that the simulation can be recreated between computers.

One of the suggestions in class was to use C# Windows Forms to create the configuration tool. Since my only experience with C# so far is within Unity it seemed like a good opportunity to learn how C# works outside of Unity. My initial plan going into this task is to create a form that can save/load settings as an external file (.xml or possibly .csv) and launch the Flocking Simulation. The simulation itself will need to be altered to read in this file and use the data it contains when starting.

FlockingInputForm

To start with I looked into how to actually launch an external program from within C#. This ended up being fairly straightforward (Process.Start()). The issue, however, was passing the data entered into the form to the flocking sim when it was started. On the C# side it was simple: add the arguments to a string and assign that string to the process's StartInfo.Arguments property. This meant that data was passed to the flocking sim in the same way you add parameters when launching a program from the command line.

The C++ side was a different story. At first I tried receiving the data as a string and converting that into an array of input variables. When this didn’t work I did some research and learned that I need to use argc and argv in order to get the input (int main(int argc, char** argv)). Now that I had the data I needed to convert it into a format that would be easily usable within the code. My first thought was to create a struct that contained all the relevant variables, meaning they could be accessed with inputData.agents. The issue with this idea was the amount of code required to manually convert each argument into its specific variable in the struct. My alternative solution was to create an enum that named each variable we needed. By using the same enum on both the C++ and C# sides of the program we can ensure that values are in the correct order.

FlockingEnum

We can then convert the data from the form into an array:

FlockingConvert

Because the argv data also contains the name of the application (in argv[0]), we have to offset the argv index by 1 to place each value in the right element of our data array. With the data in the array I can then access it wherever I need in the main function with inputData[enum name]:

FlockingConvert2
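As a rough illustration of the enum-plus-array approach described above, here is a minimal C++ sketch. The option names (AGENT_COUNT, PREDATOR_COUNT, MAX_SPEED) and the ParseInput() helper are hypothetical stand-ins, not the names used in the actual project, and it assumes every option can be stored as a float.

```cpp
#include <cstdlib>
#include <vector>

// Hypothetical option names; the real list comes from the task brief.
// The C# form must write its arguments in this same order.
enum InputIndex {
    AGENT_COUNT = 0,
    PREDATOR_COUNT,
    MAX_SPEED,
    INPUT_COUNT  // always last: the number of options
};

// Copy the command-line values into an array indexed by InputIndex.
// argv[0] is the program name, so option i is read from argv[i + 1].
std::vector<float> ParseInput(int argc, char** argv) {
    std::vector<float> inputData(INPUT_COUNT, 0.0f);
    for (int i = 0; i < INPUT_COUNT && i + 1 < argc; ++i) {
        inputData[i] = static_cast<float>(std::atof(argv[i + 1]));
    }
    return inputData;
}
```

Calling ParseInput(argc, argv) at the top of main() would then allow lookups such as inputData[AGENT_COUNT]. Options missing from the command line keep a default of 0 in this sketch.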

With that working I just need to get the rest of the data inputs placed on the form itself and then add the relevant points to the C++ code.


Collaborative Art Task: Part 1

For this task we were required to create a client program that sends different types of drawing data to a server. One of the limitations was that the client must not show what has been drawn. I decided to write my client in C++ using SFML sprites for the visuals. To start I took the example code we had been shown in class for setting up networking and put it in its own header/cpp files so I could include it in later projects without needing to copy and paste network code. With that in place and working I made a series of sprites to act as the visual elements of the client. I paired these sprites with bounding boxes (x, y coordinates and height/width values) that could be checked against the mouse position to see what is being hovered over.

DrawClient01

This was the final version I came up with. The user can select the drawing tool on the left side, and use the sliders to adjust the size and colour. The empty area within the green box is the draw space; any time the user clicks in this space a packet is sent to the server. This version of the client does not have any control over the target port/IP to send to; it must be set in the code.
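The bounding-box check used for the client's buttons and sliders can be sketched roughly as follows. The BoundingBox struct and FindHovered() helper are hypothetical names for illustration; the sketch assumes the box origin is its top-left corner and that mouse coordinates are in the same window space.

```cpp
#include <cstddef>
#include <vector>

// Axis-aligned box paired with a sprite (assumed layout: top-left origin).
struct BoundingBox {
    float x, y;           // top-left corner
    float width, height;  // size in pixels

    // True when the point (px, py) lies inside the box.
    bool Contains(float px, float py) const {
        return px >= x && px < x + width &&
               py >= y && py < y + height;
    }
};

// Return the index of the first box under the mouse, or -1 if none.
int FindHovered(const std::vector<BoundingBox>& boxes, float mouseX, float mouseY) {
    for (std::size_t i = 0; i < boxes.size(); ++i) {
        if (boxes[i].Contains(mouseX, mouseY)) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```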

Flocking Sim Optimization: Update Loop Change

For this attempt at optimizing the flocking simulation I looked at making the update function more efficient. Currently the function will loop through the array of agents and, for each agent, loop through the array again checking which behaviour update function to call. I want to try and make this more efficient by having each check call the update of both agents, which should in theory remove the need for the second agent to ever check itself against the first again. This would mean that as the update loop progresses each agent will have to check against one fewer agent than the last. My assumption in testing this is that there are enough duplicate checks occurring to cause a performance hit. If this isn’t the case (and it should be tested with more agents than the current simulation in case there’s a tipping point) then the update calls themselves are causing the performance hit and optimizing the checks will have little to no effect.

To begin with I changed the Update() function in Scene.cpp, switching its for loop from an iterator to an int index and passing that index into the agent's Update() function. Within FlockingAgent.cpp the Update() function now also accepts an int. This int is used to set the start point of the checking loop (current position + 1, i.e. everything AFTER this agent). We can then use the existing checks to determine what call the other agent needs to make. If agent 1 is a predator and agent 2 is prey then we call PredChaseUpdate() for 1 and PreyFleeUpdate() for 2. We pass the index from the scene's Update() function as the position in the array for agent 2 to update against. A minor side effect of this change is that the agent no longer needs to check if it's testing against itself, as the loop begins one agent after it.
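The loop structure described above can be sketched like this. It is a simplified stand-in rather than the project's code: the Agent struct only counts calls where the real simulation would call PredChaseUpdate(), PreyFleeUpdate() and PreyPreyUpdate(), and UpdatePair() and UpdateAll() are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Simplified agent: counters stand in for the real behaviour updates.
struct Agent {
    bool isPredator = false;
    int chaseUpdates = 0;  // stand-in for PredChaseUpdate()
    int fleeUpdates = 0;   // stand-in for PreyFleeUpdate()
    int flockUpdates = 0;  // stand-in for PreyPreyUpdate()
};

// Handle one pair once, updating BOTH agents.
void UpdatePair(Agent& a, Agent& b) {
    if (a.isPredator && !b.isPredator) {
        ++a.chaseUpdates;
        ++b.fleeUpdates;
    } else if (!a.isPredator && b.isPredator) {
        ++a.fleeUpdates;
        ++b.chaseUpdates;
    } else if (!a.isPredator && !b.isPredator) {
        ++a.flockUpdates;
        ++b.flockUpdates;
    }
    // Predator/predator pairs do nothing in this sketch.
}

// The inner loop starts at i + 1, so each pair is visited exactly once
// and an agent never tests against itself. Returns the number of checks.
int UpdateAll(std::vector<Agent>& agents) {
    int pairChecks = 0;
    for (std::size_t i = 0; i < agents.size(); ++i) {
        for (std::size_t j = i + 1; j < agents.size(); ++j) {
            UpdatePair(agents[i], agents[j]);
            ++pairChecks;
        }
    }
    return pairChecks;
}
```

For n agents this performs n(n-1)/2 checks instead of roughly n² in the original double loop, although the number of behaviour update calls stays the same.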

Running the base simulation on my PC starts at around 70FPS and builds to around 300FPS once the agents spread out slightly.

FlockingBase

Running the new version sees a very similar start up at around 90FPS which builds to around 310-320FPS once the agents spread out.

FlockingUpdateOpt

In the end it winds up being a very minor increase for what is ultimately a minor change. On the plus side this version does not change the agent behaviour at all (unlike my previous changes). Since I am seeing a small performance increase from this change I will leave it in moving forward. Once I have the user input system for the next stage of this task complete I will test if this change makes a noticeable difference when adding a greater number of agents to the simulation.

Flocking Sim Optimization: Get Nearest x

For this attempt at optimizing the flocking simulation I looked at how to cut down the number of updates per cycle. My idea was to do a larger update every x cycles that will create a short list of the nearest 10 prey and predators for each agent and then use that short list for every standard update call.

In terms of changes to the code this method only required a few changes/additions to FlockingAgent.cpp. Changes included a rewrite of the Update() function to handle the new system, overloaded versions of PredChaseUpdate(), PreyFleeUpdate(), and PreyPreyUpdate(), and the addition of GetDistance() and GetNearestAgents() functions, along with some lists to store pointers to those agents. I included a timer that would be used to call GetNearestAgents() once per second (I could also have done a version for every x frames).

The idea is that each time GetNearestAgents() is called it will begin checking the distance to each agent in the agent array and sorting them into the prey or predator lists. If a list exceeds 10 agents it will remove the last element. I considered doing a full sort and removing elements beyond 10 afterwards, but this method reduces the number of potential checks per agent by a large amount (checking if an element is 20th or 21st is a waste as they will be deleted).
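A minimal sketch of the capped, sorted insert described above, written as a standalone helper rather than as the project's GetNearestAgents(). The Neighbour struct and InsertNearest() are hypothetical names, and the sketch stores squared distances and array indices where the original stored pointers to agents.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// One entry in an agent's short list (hypothetical layout).
struct Neighbour {
    int agentIndex;    // position of the other agent in the agent array
    float distanceSq;  // squared distance, avoids a sqrt per check
};

// Insert a candidate into a list kept sorted nearest-first and capped
// at maxCount entries. Candidates that cannot make the cut are rejected
// before any sorting work is done.
void InsertNearest(std::vector<Neighbour>& nearest, Neighbour candidate,
                   std::size_t maxCount) {
    if (maxCount == 0) {
        return;
    }
    if (nearest.size() == maxCount &&
        candidate.distanceSq >= nearest.back().distanceSq) {
        return;  // further away than everything already kept
    }
    auto pos = std::upper_bound(
        nearest.begin(), nearest.end(), candidate,
        [](const Neighbour& a, const Neighbour& b) {
            return a.distanceSq < b.distanceSq;
        });
    nearest.insert(pos, candidate);
    if (nearest.size() > maxCount) {
        nearest.pop_back();  // drop the element that fell off the end
    }
}
```

Calling this once per other agent, with one list for prey and one for predators and a maxCount of 10, would give the short lists used by the standard update calls.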

Running the base simulation on my PC starts at around 70FPS and builds to around 300FPS once the agents spread out slightly.

FlockingBase

Running the new version using the Get Nearest Update() call sees a massive increase averaging around 700FPS from the moment the simulation starts.

FlockingGetNearest

Some points I noted while writing this:

  • I could probably use the existing Range functionality within the existing code. I was unsure of how it functioned, however, and wanted to get the concept working first.
  • I put a timer in every agent, and checked that against the time. A better version would do the check in Scene.cpp, which would only require a single timer.
  • One of the requirements of the task is that changes do not cause massive changes in the outcome of the flocking (it uses the same starting seed, so should have nearly the same results every run). With this change I noted some minor changes in the way the agents clumped together (visible in the screenshot).
  • I considered putting GetNearestAgents() in a separate thread to increase the speed further. However this would almost certainly cause behaviour issues in the agents (some would be acting on new data while others work from previous).
  • When running in Debug there is a large pause every time GetNearestAgents() is called. While it doesn’t occur in Release mode currently, I am worried that it could become an issue with larger numbers of agents.

Extra Note:

I ran both simulations for around 10-15 each and found that this version doesn’t end up with the same kind of behaviour as the base simulation. Will need to try something else.

Flocking Sim Optimization: Threaded Update Call

For my first attempt at optimizing the flocking simulation I decided to look at multithreading options. Since they had made such a large difference in the Ray Tracer task it seemed like a good place to start. My initial attempts using OpenMP (#pragma omp parallel for) on a number of different for loops throughout the code yielded little or no increase in speed.

I decided to take the next step and write my own thread using POCO Threads. This ended up causing me a lot of issues just trying to get the initial thread code to work. In the end I created a new class (UpdateThread) that inherited from POCO's Runnable class. With the thread working I tried placing the Update() call within this thread and running it constantly in parallel with the main draw call. In my first iteration I checked if the thread was running and, if not, started it. This version worked; however, it created a small pause in the behaviour of the agents (independent of the frame rate). I tracked this down to my initial thread start call. If the update thread finished work moments after the main thread had checked if it was still running, it would sit idle until the main thread had completed a full cycle and checked again.

I got around this by placing the update in a while loop. Now the thread would constantly run update and not create that pause effect. This version caused a crash whenever the program was closed, however, as the second thread continued to try and run while the agents were being deleted/cleaned up. My last iteration created a new bool variable in the App.cpp class. Now the update thread's while loop will check if it should continue looping. At the end of the Run() function I added a couple of lines of code to set looping to false and wait for the update thread to finish its current cycle before cleaning up. This fixed the crash.
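The loop-until-told-to-stop pattern from the last iteration can be sketched as below. Note that this swaps in std::thread and std::atomic from the C++ standard library instead of POCO's Thread and Runnable, which is what the project actually used. The UpdateLoop class and its member names are hypothetical, and a counter stands in for the scene's Update() call.

```cpp
#include <atomic>
#include <thread>

// Runs an update loop on a second thread until Stop() is called.
class UpdateLoop {
public:
    UpdateLoop() : looping_(false), updates_(0) {}
    ~UpdateLoop() { Stop(); }

    // Start the worker once; repeated calls are ignored.
    void Start() {
        if (looping_.exchange(true)) {
            return;
        }
        worker_ = std::thread([this] {
            while (looping_.load()) {
                ++updates_;  // stand-in for the scene's Update() call
                std::this_thread::yield();
            }
        });
    }

    // Clear the flag, then wait for the current cycle to finish
    // before the caller starts cleaning up shared data.
    void Stop() {
        looping_.store(false);
        if (worker_.joinable()) {
            worker_.join();
        }
    }

    long Updates() const { return updates_.load(); }

private:
    std::atomic<bool> looping_;
    std::atomic<long> updates_;
    std::thread worker_;
};
```

The atomic flag plays the role of the bool added to App.cpp, and the join() in Stop() is the wait before clean-up. This sketch only covers start-up and shutdown; it does nothing to synchronize agent data shared between the update thread and the draw call.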

Running the base simulation on my PC starts at around 70FPS and builds to around 300FPS once the agents spread out slightly.

FlockingBase

Running the new version using the separate update thread sees a massive increase of around 700-800FPS from the moment the simulation starts.

FlockingThread01

The issue with this method is a pretty major one, however. The simulation's behaviour gets completely scrambled: after a few seconds (the timing varies wildly between runs) the agents will explode outwards.

FlockingThread02 FlockingThread03

Unfortunately, when I now try to build the simulation in Debug mode to find the cause of the behaviour change, the compiler produces an error (I believe it’s a version issue with the included Poco Threads in VS2013). Since I have a couple of other ideas for optimization that don’t rely on threading I will try them before coming back to fix this one.

Overall it was interesting to work with threading beyond OpenMP (which does a lot of the work for you) and I’m happy I did at least get something working, if not working well.

Using C++ in Unity

One of the tasks we have been given for Studio 3 this trimester is to help the Game Design students get feedback in Unity from a non-standard external input device (e.g. a webcam, eye tracker, heartbeat monitor, etc.). The task I chose to work on involves using a webcam to determine if the user is looking at the keyboard/their hands while trying to touch type. We decided on using OpenCV as the base for our tool. There is also a library called OpenCVSharp that allows OpenCV functionality in C#.

The problem however is that most examples of eye or face tracking that we have found do not use OpenCVSharp, and I have run into problems trying to convert those projects across. I hit a snag yesterday while trying to rewrite a small open source gaze tracking project called eyeLike, which works on an image gradient-based eye center algorithm by Fabian Timm.

C++ OpenCV:

copy eyeLike c++

My conversion to OpenCVSharp:

copy eyeLike c#

This was working fine until I came across some function calls that I don’t believe I can convert to C# (as they use pointer math):

copy eyeLike c++ pointers

I’ve spent a few hours trying to think of a way to convert this function to C# and have come up with nothing. After class yesterday it was suggested that we create C# bindings for the program rather than trying to convert it. Since I hadn’t dealt with bindings or cross-language coding in general I was reluctant to attempt it. After doing some research this morning, however, I think it will be possible (or at least worth trying). To start with I looked at options involving SWIG, COM, and Facade, which all seem to be ways of converting or creating bindings for an existing library. While it might be easier to go with one of these options I want to try a more manual approach (which will hopefully result in a better personal understanding of the process). I found this example on a blog that goes over creating a simple library in C++ and using it in C#.

Following the first example shown I was able to create a basic math library, build it as a .dll, and access its functionality in Unity by changing the example C# script slightly:

c# binding

and the result:

c# binding output

While it works in this case, the downside, as mentioned in the blog post, is that each function needs to be declared in the C# script, which would create a lot of extra code when dealing with a large library of functions. My next step will be to try and apply this to eyeLike and create a binding for C#.
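For reference, the C++ side of a basic math library like the one described above might look roughly like this. The function names Add and Multiply and the EXPORT_API macro are hypothetical, not taken from the blog example that was followed.

```cpp
// Export with C linkage so the function names are not mangled and can
// be looked up by name from another language.
#if defined(_WIN32)
#define EXPORT_API extern "C" __declspec(dllexport)
#else
#define EXPORT_API extern "C"
#endif

// Hypothetical library functions, built into a .dll.
EXPORT_API int Add(int a, int b) {
    return a + b;
}

EXPORT_API float Multiply(float a, float b) {
    return a * b;
}
```

On the C# side each exported function then needs its own matching extern declaration marked with a DllImport attribute, which is where the per-function overhead comes from.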

Ray tracing and optimization

Our first task this trimester for Studio 3 was to take a sphere-only ray tracer written by Greg and attempt to optimize it any way we can. While the ray tracer was hard coded to render the same scene every run, we should not use any specific solutions that would only work for this one scene. Optimizations should improve the quality of the ray tracer in general, not its ability to render this particular scene.

RayTracer0
First Pass (no optimizations) 55694ms

We had been given a few hints about what to try in order to optimize the ray tracer, the first of which was multithreading. The ray tracer currently runs on a single thread, so my first task was to enable OpenMP in Visual Studio. The main functionality of OpenMP that I was interested in was its ability to multithread a for loop. Since the ray tracer contained a number of for loops I first tried enabling it in different places and checking how it affected the render time.

RayTracer1

RayTracer3

RayTracer4

This last one ended up being a very poor option. Not only was it slower than the base render, but it also improperly calculated lighting due to the loop not being thread safe. The result was a slower render time and a large number of improperly calculated pixels.

RayTracer5

Note the speckled effect on the green spheres. Because the ray tracer was doing light calculations in parallel, sometimes the correct light would be overwritten by the wrong one.

I decided that threading on the y loop of the main ray calculations would be best. However, there was one obvious issue with the way OpenMP implemented its threading.

RayTracer2

OpenMP was splitting the for loop evenly into 4 chunks. Most of the heavy calculations occur in the bottom half of the render, however, so the threads given the top chunks finished early and sat idle. My solution was to break the loop into chunks of 4 lines (or however many cores the CPU has) and perform OpenMP’s threading on each of those. The result:

RayTracer6

RayTracer7
Per line threading: 20656ms

I was surprised to find this only shaved a few more seconds off the render time, but it appeared to render more efficiently, with all cores working until the render was complete.
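A rough sketch of the per-line chunking described above. RenderInBands() is a hypothetical name, and a per-row counter stands in for tracing the pixels of that row. The pragma is ignored when OpenMP is not enabled, in which case the code simply runs serially.

```cpp
#include <algorithm>
#include <vector>

// Walk the image in bands of threadCount rows and let OpenMP share the
// rows of each band between threads. Returns how often each row was
// rendered so the coverage can be checked.
std::vector<int> RenderInBands(int height, int threadCount) {
    std::vector<int> rowHits(height, 0);
    for (int bandStart = 0; bandStart < height; bandStart += threadCount) {
        int bandEnd = std::min(bandStart + threadCount, height);
        // Each row writes only to its own element, so rows are independent.
        #pragma omp parallel for
        for (int y = bandStart; y < bandEnd; ++y) {
            rowHits[y] += 1;  // stand-in for tracing every pixel in row y
        }
    }
    return rowHits;
}
```

Because neighbouring rows tend to cost a similar amount, each band keeps all threads busy for about the same time. OpenMP's schedule(dynamic) clause on a single loop over all rows is another way to get a similar balancing effect, though that is not what was used here.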

Beyond threading I spent some time looking into different alternatives to improve the render time. Unfortunately I didn’t have time to figure out implementation for them, but would like to attempt them some time in the future. The first option I looked at was OpenCL and using it to perform ray trace calculations on the GPU rather than the CPU. The current implementation made this impossible (as OpenCL kernels do not support recursion) and I was unsure of how to restructure the code to make this possible. After discussion in studio I learned that this could be accomplished by rewriting the way rays are calculated. Currently the program will calculate 4 ray bounces for each pixel, one pixel at a time. If it was rewritten to perform all first ray bounces, then return results and perform all second ray bounces, and so on, then it could be passed to the GPU for processing.

The second option I considered was using subdivision to reduce the number of collision checks performed by each ray. Currently, each ray must check against every sphere in the scene for collision, and this is actually the major time sink in the program. By subdividing the screen into smaller sections we can eliminate checking large blocks of spheres by ensuring they aren’t even in the same screen quarter as the ray. Again I was a little lost on how to implement this, but after seeing some other students’ ray tracers and discussing in studio I want to try and implement this myself at a later time.