Izumi: Ralf - Xeres

Xeres is an audio/video/chat application.
Site License And Disclaimer as well as contact information are available here.

$Id: Xeres.izu,v 1.26 2006-05-09 15:58:19 ralf Exp $

Quick Links: Milestone 1, Milestone 2


1. Milestone 2

1.1. Goal

20051023

The main project goal is to produce an ...


1.2. Design & Spec


1.3. Plan

To be done (by decreasing priority):

2005xxxx [1.N] Classes: xx

Finished (by decreasing date):

2005xxxx [1.F] xx
20051023 [1.F] This file


1.4. Notes

(Notes are given in chronological order.)


«»  2005/10/23 «» Overview  «»

This is an extension of the Xeres project with a twist.

Xeres' goal was to understand how to create an audio/video chat application.

Xeres II's goal is to create a remote audio/video viewer. In a way, think of it as an audio/video chat application where the source is not necessarily a webcam and not necessarily live. It may be a webcam, or it may simply be recorded video messages.

This uses a standard client/server approach:

Some more descriptive use-cases:

Some implementation or technical points:


«»  2005/10/23 «» Methodology  «»

For this project, I'd like to change my methodology a bit.

Rapid Application Development clearly doesn't work when the project remains untouched for several months.

(Update 20051112: As a result I wrote Ideas For Dev.)


«»  2005/11/08 «» Xeres, Rivet and Rig  «»

I finally understand what I was looking for: conceptually Xeres, Rivet and Rig should be separate components of a bigger project. And that bigger project is exactly Xeres II.

Here's the idea:

In other words: the viewer part of the client is Rivet, the video exchange part is Xeres and the server/web page mode is Rig II.

Some misc details:

To act as a webcam:

Project wise, it makes sense to keep 3 separate projects:

There should be some dependencies between the projects:

Still it's a good idea to keep the functionality separated in different projects to minimize interface pollution, especially since I will probably implement different parts using different languages or frameworks:

More comments & details to follow. This is just a rough idea anyway, it should be clearly and fully specified before anything happens.


2. Milestone 1

2.1. Goal

20050129

The main project goal is to produce an audio/video/text chat application.

Minimalist that is. Or more exactly, an experiment regarding the design and the performance that one could expect with minor knowledge in real time compression.

"Why?" will you ask. Good question. First, because I'm disappointed by the various video conferencing software I've used. And then mostly "because I can" or more exactly "can I really do it?" That's the experimental part of the project: I'd like to see what is doable nowadays with "standard" algorithms available as open source and easy to use.
For example OpenH323 does all that but it does not apply. It's far from easy to use imho and after having tried to use their OpenPhone demo app for several months with the GNU Gatekeeper I'm quite disappointed. (Note that here I am not talking about the OpenPhone project, just the application demo that uses OpenH323.)

Don't get me wrong. I do like OpenH323 a lot and I am not even trying to redo the same thing. I can't beat an H.264 codec in my spare time.
The experiment part is: do I need to?
Could a naive implementation simply encode video using JPEG 2000 and a standard GSM codec (or other audio codec) using only free or open libraries?

The target here is very specific as you can see in the notes below. I'm targeting a 160 kbps bandwidth with a 160 ms ping time, one client behind a NAT/firewall and the other directly on the net.

20050331

I need to expand on the goal of this project: building the application I have in mind is one part of the project, yet I would say that's almost the smallest part of the motivation. The real motivation lies in how and with what.
The first how is how would I implement it? The structure design and the ability to actually implement the design as preconceived or not is in fact more important for me than the resulting product of the implementation. How do I want to design it, how does that design hold in real life, and how does the design get fixed/adapted to meet the goal?
The second with what means what existing open source projects, libraries or material can help serve as a foundation? For example can I find existing libraries to handle video acquisition, encoding, decoding, etc.? Do I have to recode my own GSM codec or can I legally use an existing one in a GPL context?

2.2. Description

2.3. Milestones

20050130

Milestones:

2.3.0 M0: Initial prototype

The goal of M0 is to explore the architecture described in the notes. Each module & class should be as minimal as needed. Some parameters may be fixed (destination IP, etc.) and the user interface will be extremely minimal.

There are three phases in this milestone:

It is also important to see that since the workflow is symmetrical only one workflow direction needs to be implemented in the test bed.

The application's user interface written here has nothing to do with the real end-user interface of a "chat application".


2.4. Plan

2.1. M0: Initial prototype

To be done (by decreasing priority):

20050322 [1.N] Classes: RReceiverThread, create class

20050326 [1.N] Classes: RRecorderVideo, limit frame rate

20050130 [1.N] Classes: RICodec, create interface with Encode/Decode methods
20050130 [1.N] Classes: RCodecJpeg, Encode, Decode (full picture)
20050315 [2.N] Classes: RCodecJpeg, Encode, Decode use sub-blocks

20050130 [1.N] Classes: RInputMixer, create class
20050315 [1.N] Classes: RInputMixer, initial impl simply takes packets and forward to network
20050130 [1.N] Classes: ROutputDemux, create class
20050315 [1.N] Classes: ROutputDemux, initial impl simply takes packets from receiver

20050130 [1.N] Classes: RNetworkSender, create class
20050130 [1.N] Classes: RNetworkReceiver, create class
20050315 [1.N] Classes: RNetworkSender/Receiver, initiate connection
20050315 [1.N] Classes: RNetworkSender/Receiver, exchange packets

20050315 [1.N] Classes: RInputMixer, better impl splits packets to match network sender requested size
20050315 [1.N] Classes: ROutputDemux, recombines split packets

20050315 [1.N] Classes: RNetworkSender report back to InputMixer
20050315 [1.N] Classes: RInputMixer, ability to drop packets or change quality

20050130 [1.N] Classes: RInputMixer, get data queues from several recorders and mix them
20050130 [1.N] Classes: ROutputDemux, demux data from mixer into several data queues
20050130 [1.N] Classes: ROutputDemux, feed data to players
20050130 [2.N] TestApp: Display meaningful bandwidth stats

20050315 [1.N] Planning: Complete M0 initial task list (step 2)

20050130 [2.N] Classes: RRecorderText: create class, attach to UI
20050130 [2.N] TestApp: Plug to text field, implement start/stop
20050130 [2.N] Classes: RRecorderText, implement start, stop
20050130 [2.N] Classes: RRecorderText, capture text in data queue

20050130 [2.N] Classes: RPlayerText: create class, attach to main UI
20050130 [2.N] Classes: RPlayerText: get data buffer from RRecorderText
20050130 [2.N] Classes: RPlayerText: display data buffer

20050130 [2.N] Classes: RRecorderSound: create class, implement device enumeration
20050130 [2.N] TestApp: Use device enumeration, pref selection, implement start/stop
20050130 [2.N] Classes: RRecorderSound, implement start, stop
20050130 [2.N] Classes: RRecorderSound, capture sound in data queue
20050130 [2.N] TestApp: Display sound level from RRecorderSound data queue and drop queue

20050130 [2.N] Classes: RPlayerSound: create class, implement device enumeration
20050130 [2.N] TestApp: Use device enumeration, pref selection, implement start/stop
20050130 [2.N] Classes: RPlayerSound: get data buffer from RRecorderSound
20050130 [2.N] Classes: RPlayerSound: output data buffer

20050130 [2.N] Classes: RCodecGsm, Encode, Decode

20050130 [3.N] Daemon: Create skeleton daemon in Python

20050306 [4.N] Classes: RRecorderVideo, handle device changing when already started
20050306 [4.N] Classes: RIRecorder, need an enumerate/set Resolution functionality.
20050306 [4.N] Classes: RRecorderVideo, implement enumerate/set Resolution.
20050306 [4.N] TestApp: Enumerate and get/set video resolution in prefs.

20050130 [5.N] Classes: RCodecJpeg, Sub-block based encoding
20050130 [5.N] Classes: RCodecJpeg2k, Encode, Decode
20050130 [5.N] Classes: RCodecLpc10, Encode, Decode
20050130 [6.N] TestApp: Ability to choose codec in prefs

Finished (by decreasing date):

20050327 [1.F] TestApp: Display frame rate
20050326 [1.F] Classes: RFreqCounter in Utils
20050325 [1.F] Classes: RDevFactory, TestApp: connect RRecorderVideo to RPlayerVideo local
20050325 [1.F] Classes: RPlayerVideo: async loop, display data buffer for local
20050325 [1.F] Classes: RPlayerVideo: create class, attach to main UI

20050325 [2.F] TestApp: Enable/disable audio/video depending on device availability
20050324 [1.F] Classes: RPlayerVideo: create class (skeleton + unit file)
20050324 [1.F] TestApp: Testing with smaller image size: 176x144.
20050323 [1.F] TestApp: Use RSenderBase in existing test app
20050322 [1.F] Classes: RSenderBase, create classes
20050322 [1.F] Classes: RISender, RIReceiver, create interfaces
20050322 [1.F] Classes: Moved RBuffer to Utils. Added RIBuffer interface.

20050320 [2.F] Classes: Adding inline doc comments for all classes and public methods
20050320 [2.F] Framework: Updated to use NUnit 2.2
20050315 [1.F] Planning: Complete M0 initial task list (step 1)
20050310 [2.F] TestApp: Recorder encoding as Jpeg in RBufferVideo and MainApp display Jpeg buffers
20050309 [2.F] Classes: Added RBufferVideo.FromBitmap
20050307 [1.F] TestApp: Fixed loading window size to work with auto resize window.

20050306 [1.F] TestApp: Automatic video preview resize. Auto resize window.
20050306 [1.F] TestApp: Display recorded images from RRecorderVideo data queue
20050306 [1.F] Classes: RRecorderVideo, capture images in data queue
20050306 [1.F] Classes: RRecorderVideo, implement start, stop
20050306 [1.F] TestApp: Updating preferences for video recorder device
20050306 [1.F] TestApp: Added preference window
20050306 [1.F] Classes: Added RDeviceFactory, fixed name RIEncoder as RIRecorder
20050306 [1.F] TestApp: Use device enumeration, pref selection, implement start/stop
20050306 [1.F] Classes: RRecorderVideo: create class, implement device enumeration

20050227 [1.F] Classes: RIRecorder, RIPlayer, create interfaces
20050227 [1.F] Classes: RBufferVideo, create class

20050221 [1.F] Classes: RBuffer, create class

20050220 [1.F] Classes: RDeviceInfo, create class
20050220 [1.F] Misc:    Add to CVS
20050220 [1.F] Project: Create skeleton project for Network module
20050220 [1.F] Project: Create skeleton project for Codec module
20050220 [1.F] Project: Create skeleton project for Mixer module
20050220 [1.F] Project: Create skeleton project for Device module

20050220 [1.F] TestApp: Add logging display
20050220 [1.F] TestApp: Add preferences storage
20050220 [1.F] TestApp: Implement placeholder start/stop/pause/quit
20050220 [1.F] TestApp: Implement placeholders for videos, sound level, text, stats, log
20050220 [1.F] Project: Create skeleton project for test bed application

20050130 [1.F] Planning: M0 defined, initial task list
20041221 [1.F] This file

2.5. Milestone 1

2.6. Milestone 2


2.7. Notes

(Notes are given in chronological order.)


«»  2005/01/29 «» Overview  «»

There are various components and they should all be as independent as possible. Components are grouped logically in modules. Dependencies between modules should be made only through abstract interfaces.

The various components are:

The workflow should look like this:

+----------+    +-------+    +--------+               +----------+    +-------+    +--------+
| Recorder | => | Mixer | => | Sender | => ( net ) => | Receiver | => | Demux | => | Player |
+----------+    +-------+    +--------+               +----------+    +-------+    +--------+
                  ^   ^        |                                        ^
     +---------+  |   |        |                           +---------+  |
     | Encoder |--+   +---<<---+                           | Decoder |--+
     +---------+      (feedback)                           +---------+

Modules, classes, etc.:

Implementation could logically be organized in steps as follows:

Note that the encoders shouldn't care whether they are invoked from the mixer or to test processing of the recorder's buffers. That is they just need to operate on single buffers or streams of buffers. Ideally each buffer should be encoded separately: the encoder may need the spatial information from the source, but the decoder should still work if some buffers are missing.

Buffers need some metadata. For the video stream, for example, rather than transmit whole images it may be more efficient for the encoder to transmit only what changed, by transmitting just a number of macroblocks. The metadata should also contain the encoder's name so that the correct decoder can be selected. The metadata may be a hashtable with a minimal set of types accepted and serialized by the network class (well, actually by the RBuffer itself).

For testing purposes (i.e. NUnits), network related functions should always use a generic stream class when possible. This way the sender/receiver pair can be tested using string buffers.
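
As an illustration only, here is a minimal Python sketch of that testing idea (Python being the language planned for the proxy daemon, not the language of the Xeres classes). The function names and the 4-byte length prefix are assumptions made for the example; the point is that the sender and receiver only see a generic stream, so a test can hand them an in-memory buffer instead of a socket.

```python
import io
import struct

def send_packet(stream, payload: bytes) -> None:
    """Write one packet as a 4-byte big-endian length followed by the payload."""
    stream.write(struct.pack(">I", len(payload)))
    stream.write(payload)

def _read_exact(stream, count: int) -> bytes:
    """Read exactly count bytes, or fewer if the stream ends early."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def receive_packet(stream):
    """Read one packet; return None at a clean end of stream."""
    header = _read_exact(stream, 4)
    if not header:
        return None
    if len(header) < 4:
        raise EOFError("truncated packet header")
    (length,) = struct.unpack(">I", header)
    payload = _read_exact(stream, length)
    if len(payload) < length:
        raise EOFError("truncated packet payload")
    return payload

# In a unit test an in-memory buffer stands in for the network connection.
wire = io.BytesIO()
send_packet(wire, b"hello")
send_packet(wire, b"")
wire.seek(0)
received = [receive_packet(wire), receive_packet(wire), receive_packet(wire)]
```

With a real connection, the file-like object returned by socket.makefile("rwb") could presumably be passed to the same two functions, with a flush after each write.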

Note that the workflow only goes in one direction (recorder towards player). In a two-way communication application, two of these workflows will run in parallel in opposite directions.


«»  2005/01/29 «» Network Proxy  «»

Nothing has been said before about network session initiation nor dealing with NATs/firewalls.

The initial target is to work with this configuration:

DYN cannot access FIX since it's not on the same network, so it can't open any connection -- there's no port forwarding. That means the FIX client should open the connections. But it can't unless it knows DYN's address.

So we still need "something" on the NAT/firewall: a registering proxy. When DYN starts, it connects to the proxy to register its IP and a name. When FIX starts, it also connects to the proxy to register its IP and name. From the end point address, the proxy can determine if both end points are on the same network or not. Each client keeps its connection with the proxy open (with a heart beat).

To initiate a session, both clients ask the proxy, not directly the other end point.

If DYN wants to initiate, the proxy tells FIX to open the connection towards DYN.
If FIX wants to initiate, the proxy tells FIX to open the connection towards DYN.
The bottom line is that the proxy always tells the local network's client to open the connection, which means the NAT will automatically send feedback packets accordingly.

In the case where both clients have dynamic or static IPs directly on the Internet, then the proxy will reply to the client that started first with the other end point address.

At first, the proxy does NOT forward data. That means both clients behind NATs/firewalls will not work.
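
The decision the proxy has to make can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the daemon itself: the Registry class, the LOCAL_NETWORK range, the sample addresses and the PUB client are all hypothetical, and "the client that started first" is read here as "the client that registered first". The sketch cannot detect two clients behind different NATs, which the notes say will not work anyway.

```python
import ipaddress

# Hypothetical local network behind the NAT/firewall where the proxy runs.
LOCAL_NETWORK = ipaddress.ip_network("192.168.1.0/24")

class Registry:
    """Keeps the name -> address of every registered client, in arrival order."""

    def __init__(self):
        self._clients = {}  # dicts preserve insertion order

    def register(self, name: str, address: str) -> None:
        self._clients[name] = ipaddress.ip_address(address)

    def is_local(self, name: str) -> bool:
        return self._clients[name] in LOCAL_NETWORK

    def opener(self, a: str, b: str) -> str:
        """Return which of the two clients must open the connection.

        Who asked for the session does not matter, only the network layout.
        """
        a_local, b_local = self.is_local(a), self.is_local(b)
        if a_local != b_local:
            # Exactly one client is on the local network: it opens the
            # connection so the NAT lets the reply packets back in.
            return a if a_local else b
        # Otherwise (assumed reading of "started first"): the client that
        # registered first receives the other end point's address.
        names = list(self._clients)
        return a if names.index(a) < names.index(b) else b

registry = Registry()
registry.register("DYN", "203.0.113.7")   # dynamic IP, outside the NAT
registry.register("FIX", "192.168.1.10")  # fixed IP on the local network
registry.register("PUB", "198.51.100.4")  # a second client directly on the net
```

Whether DYN or FIX asks for the session, opener() names FIX, which matches the two cases listed above.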

This proxy will be a simple Python network daemon.

In a two-way communication session, two connections are needed. One is used by FIX to send data to DYN, the other is used by DYN to send data to FIX. What happens is that FIX will open both connections. On the one it uses to receive data from DYN it will essentially do a blocking read. Heart beats should be exchanged if there's no data to transmit (unlikely).

For a start the two connections are done using TCP streams. These streams are two-way too, the replies are used to provide the feedback.


«»  2005/02/20 «» Xeres  «»

Project name changed from FaceAFace to Xeres.


«»  2005/02/27 «» Implementation Details  «»

Here are several implementation details I wrote on the back of an envelope last week. Most of them seem almost obvious, I just need to clearly write them down here for later reference.

First regarding video compression, the idea is to split an image in blocks.
This is of course similar to what MPEG and H.261-H.263 codecs do: they split an image in macro blocks, encode them separately using DCT, perform more or less vector prediction/compensation on these and transmit only a subset of the DCT coefficients to match the bandwidth.
I'm not going there. That is I am not reimplementing H.261 -- if I really wanted to do that I would just use an existing codec from the OpenH323 project for example.

Instead, the naive implementation I foresee will split an image in blocks and encode them using JPEG or JPEG 2000 (a.k.a. J2k) and play with the image quality.

The encoder will have to use the temporal information, that is determine which blocks have changed from one image to another. Going further, it seems like a good idea to transmit those blocks we know generally change often. So not only each block should be diffed with the one from the previous image and given a "note" (probably a chroma delta), but moreover the encoder should keep an average of this block value with a window. Blocks that have the higher average will be transmitted more often.
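
A possible shape for that bookkeeping, as a hedged Python sketch: each block gets a per-frame delta and a short history, and blocks are ranked by their windowed average. The BlockActivity class is hypothetical, blocks are simplified to flat lists of pixel values, and a plain mean absolute difference stands in for the "chroma delta" mentioned above.

```python
from collections import deque

class BlockActivity:
    """Tracks how much each block changes, averaged over the last few frames."""

    def __init__(self, block_count: int, window: int = 4):
        self._history = [deque(maxlen=window) for _ in range(block_count)]

    def update(self, previous: list, current: list) -> None:
        """Record one delta per block; a block is a list of pixel values here."""
        for index, (old, new) in enumerate(zip(previous, current)):
            delta = sum(abs(a - b) for a, b in zip(old, new)) / len(new)
            self._history[index].append(delta)

    def average(self, index: int) -> float:
        history = self._history[index]
        return sum(history) / len(history) if history else 0.0

    def send_order(self) -> list:
        """Block indices, most active first."""
        return sorted(range(len(self._history)), key=self.average, reverse=True)

# Three 4-pixel blocks: block 0 is static, block 1 flickers, block 2 drifts.
frames = [
    [[10, 10, 10, 10], [0, 0, 0, 0], [50, 50, 50, 50]],
    [[10, 10, 10, 10], [90, 90, 90, 90], [52, 52, 52, 52]],
    [[10, 10, 10, 10], [0, 0, 0, 0], [54, 54, 54, 54]],
]
activity = BlockActivity(block_count=3)
for before, after in zip(frames, frames[1:]):
    activity.update(before, after)
order = activity.send_order()
```

Here the flickering block comes first in the send order and the static block last; how often each rank actually gets transmitted is left open, as in the note.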

Note that the first implementation uses a TCP stream so technically if both sides are connected, there can be no data loss. This may change though, as it makes sense for this kind of video to be transmitted as UDP packets. Out of order packets can be trashed if too old or used if they paint a block which has not been painted by a more recent block yet. In this case the decoder simply needs to keep one time stamp per block indicating the last time it was refreshed. Also in this case it would make sense for the receiver to give regular feedback to the emitter saying how many packets were received in a given time frame. This would allow the receiver to quantify packet losses. The feedback should also incorporate data that allows measuring the transmit delay. This feedback is more for stats purposes than controlling the bandwidth as it doesn't say anything useful on congestion per se.
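
The per-block time stamp rule for the UDP case could look like the following Python sketch. The BlockCanvas class, the max_age threshold and the explicit "now" argument are assumptions for illustration; only the two rejection rules come from the note.

```python
class BlockCanvas:
    """Decoder-side state: the latest data and time stamp for each block."""

    def __init__(self, block_count: int, max_age: float = 1.0):
        self.blocks = [None] * block_count
        self.stamps = [float("-inf")] * block_count
        self.max_age = max_age  # hypothetical cut-off, in seconds

    def paint(self, index: int, stamp: float, data: bytes, now: float) -> bool:
        """Apply a packet; return False when it is dropped."""
        if now - stamp > self.max_age:
            return False  # too old to be worth painting
        if stamp <= self.stamps[index]:
            return False  # this block was already painted by newer data
        self.blocks[index] = data
        self.stamps[index] = stamp
        return True

canvas = BlockCanvas(block_count=2)
results = [
    canvas.paint(0, stamp=10.0, data=b"new", now=10.1),
    canvas.paint(0, stamp=9.8, data=b"late", now=10.2),   # block 0 already newer
    canvas.paint(1, stamp=9.8, data=b"late", now=10.2),   # block 1 still unpainted
    canvas.paint(1, stamp=5.0, data=b"stale", now=10.3),  # older than max_age
]
```

The same late packet is rejected for block 0 but accepted for block 1, which is the "used if the block has not been painted more recently" behaviour.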

The other implementation detail I was thinking of is that the various classes such as encoder/decoders that need to do something UI-related shouldn't access the UI directly. Being a .Net/WinForm implementation, the UI may not be running in the same thread and thus should not be accessed directly. The typical workaround is to use an asynchronous delegate (i.e. an asynchronous thread-safe event call.)

Finally as I was coding RBuffer recently, I put a time stamp member and a rectangle bounds member directly in the class. This contradicts my earlier note that a buffer should contain data and metadata. The metadata would be a hash table, maybe with some constant integer keys for frequently used metadata and optional string keys for custom class-specific metadata.

The obvious solution is that I want an RBuffer, which is generic, and I want a specific RBufferVideo which has a time stamp and a bounds rectangle.

Also I was initially assuming the metadata would be only strings, thus serializing and deserializing a rectangle from it each time I needed to access it would be expensive. This is a mistake. What I should do is clearly document the hash table as only containing a fixed subset of types, for example int, double, string and rectangle.

Also the base class RBuffer needs a serializer/deserializer for network streams. As I don't know exactly what I will want here, I'll just create dummy methods throwing a NotImplementedYet exception and then implement them later as needed.
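
To make the typed-metadata idea concrete, here is a small Python sketch of a buffer whose metadata accepts only int, double, string and rectangle values and can round-trip through a byte stream. Everything in it is an assumption for illustration: the Buffer and Rect names, the KEY_* constants, and the JSON-header wire format are not the RBuffer design, which the note leaves open.

```python
import json
from typing import NamedTuple

class Rect(NamedTuple):
    x: int
    y: int
    width: int
    height: int

# Hypothetical integer keys for frequently used metadata.
KEY_ENCODER = 1
KEY_BOUNDS = 2

# The fixed subset of value types, with the tag used on the wire.
_ALLOWED = {int: "int", float: "double", str: "string", Rect: "rect"}

class Buffer:
    """Raw data plus metadata limited to a fixed set of value types."""

    def __init__(self, data: bytes = b""):
        self.data = data
        self._meta = {}

    def set_meta(self, key, value) -> None:
        if not isinstance(key, (int, str)) or type(value) not in _ALLOWED:
            raise TypeError(f"unsupported metadata: {key!r} -> {value!r}")
        self._meta[key] = value

    def get_meta(self, key, default=None):
        return self._meta.get(key, default)

    def serialize(self) -> bytes:
        """4-byte header length, JSON list of [key, type tag, value], raw data."""
        entries = [[key, _ALLOWED[type(value)], value]
                   for key, value in self._meta.items()]
        header = json.dumps(entries).encode("utf-8")
        return len(header).to_bytes(4, "big") + header + self.data

    @classmethod
    def deserialize(cls, blob: bytes) -> "Buffer":
        size = int.from_bytes(blob[:4], "big")
        entries = json.loads(blob[4:4 + size].decode("utf-8"))
        buffer = cls(blob[4 + size:])
        for key, tag, value in entries:
            buffer.set_meta(key, Rect(*value) if tag == "rect" else value)
        return buffer

original = Buffer(b"\xff\xd8jpeg")
original.set_meta(KEY_ENCODER, "jpeg")
original.set_meta(KEY_BOUNDS, Rect(16, 32, 8, 8))
original.set_meta("quality", 75)
copy = Buffer.deserialize(original.serialize())
```

Storing the rectangle as a typed value means it is parsed once, on deserialization, rather than on every access, which is the cost the note wants to avoid.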


«»  2005/03/06 «» Implementation Details  «»

Just implemented the basis of RRecorderVideo device enumeration. The RIRecorder interface also gained a Logger property which is set by the main window. Currently the logger is the main window. There will be a need to introduce an extra indirection if and when I need to append stuff to the log from a separate thread. I plan to do that by creating an RAsyncLog which implements the RILog interface and simply dispatches log messages to a given target using the asynchronous delegate technique.
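
The indirection can be illustrated with a queue-based stand-in written in Python. This is not the .Net asynchronous delegate mechanism, only the same idea: worker threads hand messages over, and the thread that owns the log display is the only one that touches it. The AsyncLog class and its method names are hypothetical.

```python
import queue
import threading

class AsyncLog:
    """Collects messages from any thread; the owning thread delivers them."""

    def __init__(self, target):
        self._target = target          # callable that really appends to the log
        self._pending = queue.Queue()  # thread-safe hand-off

    def log(self, message: str) -> None:
        """Safe to call from any thread; never touches the target directly."""
        self._pending.put(message)

    def dispatch(self) -> int:
        """Call from the thread that owns the target (e.g. on a UI timer)."""
        count = 0
        while True:
            try:
                message = self._pending.get_nowait()
            except queue.Empty:
                return count
            self._target(message)
            count += 1

shown = []                      # stands in for the main window's log display
log = AsyncLog(shown.append)
worker = threading.Thread(
    target=lambda: [log.log(f"frame {i}") for i in range(3)])
worker.start()
worker.join()
delivered = log.dispatch()      # runs on the "UI" thread
```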

I also introduced an RDeviceFactory class in the Device module. This is not really a factory in the traditional sense but I lack a better word for it. Basically it will host instances of the recorders and players instead of having them created and managed by the main window. The idea is that the main window should deal mostly with UI stuff and it will change a lot in the future, so the less code I place there, the better. This factory hosts the instances of the recorders and players, initializes them and also provides a place for utility methods that act on all recorders and players at once.

Another advantage of the factory is that it forces the UI to access the recorders or players only through their RIRecorder or RIPlayer interface and thus minimizes potential abuse such as a direct call to a method not in the interfaces.


«»  2005/03/06 «» Plan Changes  «»

I'm going to reorganize the plan as follows:

I'll probably integrate SNE2 directly in Xeres and transform it to use the full .Net framework (the current SNE2 is a .Net Compact Framework library, which is good for Pocket PC compatibility but is too limited for my taste.)

Then once the canvas is ready I can expand it:


«»  2005/03/09 «» Barking up the wrong tree  «»

These last days, I finished the first version of the RRecorderVideo that grabs images. As a test, it enqueued System.Drawing.Bitmap instances instead of RBufferVideo in its data queue. Then on a timer the main app was polling that queue and displaying the images.

The second phase of that test was to enqueue RBufferVideo instances. So naturally I started adding code that would do a Bitmap.LockBits, read the bits and store them as 24-bpp RGB in the buffer, all of this directly in the frame-received callback. Then I realized this was not convenient for debugging so I added a new RBufferVideo.FromBitmap() method, which also had the advantage that I could easily write an NUnit test for it. Eventually I realized it would be more efficient to store 32-bpp RGBA data with an empty alpha channel to take advantage of the 4-byte alignment.

Then this morning I realized how this is all wrong.

Let me explain: in my traditional C-like kind of thinking, a bitmap is just a bunch of bytes, that is a pointer and some length, and it's up to me to deal with the format, the component order and what they all represent. So my initial reflex was to get the bits from a Bitmap object and manipulate them.
Yet what would that manipulation be? My goal is not to do a DCT-like encoding manually. I'm not trying to reinvent JPEG nor MPEG here. Instead what I want for my first prototype is to manipulate bitmaps "as-is". That means I just need to keep the Bitmap object from .Net, then encode it as JPEG using the internal methods and finally stream that JPEG data directly into the RBufferVideo.
The second part of the encoder/decoder will be to work with the same principle but on a sub-block basis. Again, I don't even have to try to get the bits of the bitmap, I just need to use the JPEG encoder/decoder available right in .Net.


«»  2005/03/10 «» Preliminary JPEG Test  «»

I just tried a quick test/hack with JPEG encoding.
Currently the hack is that RBufferVideo has a method which receives a Bitmap, encodes it as JPEG and stores the stream. Then there's a reverse function which, given a data stream, decodes the JPEG and creates an Image object.
The video recorder directly places the JPEG buffers in its queue and the MainApp grabs them on a timer.
The test shows that the bandwidth is way over the 160 kilobit limit, no big surprise here.
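
For a sense of scale, the budget can be worked out with a few lines of Python. The figures are assumptions, not measurements: 160 kbps is taken as 160,000 bits per second, the frame size is the 176x144 test size listed in the plan, the frame rates are hypothetical, and the whole link is given to video (no audio or overhead).

```python
LINK_BITS_PER_SECOND = 160_000                 # target bandwidth from the goal
WIDTH, HEIGHT, BYTES_PER_PIXEL = 176, 144, 3   # plan's test size, 24-bpp RGB

link_bytes_per_second = LINK_BITS_PER_SECOND // 8
raw_frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL

def frame_budget(frames_per_second: int) -> int:
    """Bytes available per frame if video used the whole link."""
    return link_bytes_per_second // frames_per_second

def required_ratio(frames_per_second: int) -> float:
    """Compression ratio needed compared to raw frames."""
    return raw_frame_bytes / frame_budget(frames_per_second)

for fps in (5, 10, 15):   # hypothetical frame rates
    print(fps, frame_budget(fps), round(required_ratio(fps), 1))
```

At 10 frames per second, for instance, each frame has to fit in 2000 bytes, roughly a 38:1 reduction from the raw 76032-byte frame, which gives an idea of why whole-frame JPEG alone overshoots.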

Now I need to think of where to go and change my planning accordingly. The current plan is to implement the video player part. That sounds good.

The other part is that I need to put the encoding/decoding part in the Codec module so I may need to do that before the video player.

Also I need to make a fake Network module which is a local loop with a rate limiter. Finally I need to add the Mixer which will use feedback from the network to drop packets and control the quality/frame rate to match the bandwidth.
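
One way such a fake network could work is a token bucket, sketched below in Python. The token bucket is a technique chosen here for illustration, not something the notes specify, and the RateLimitedLoop class, its parameters and the injectable clock are all hypothetical. The boolean returned by send() plays the role of the feedback the Mixer would consume.

```python
class RateLimitedLoop:
    """Local loopback that accepts at most bytes_per_second of traffic."""

    def __init__(self, bytes_per_second: float, burst: float, clock):
        self._rate = bytes_per_second
        self._burst = burst          # largest budget that can accumulate
        self._clock = clock          # injectable so tests control time
        self._tokens = burst
        self._last = clock()
        self.delivered = []
        self.dropped = 0

    def send(self, packet: bytes) -> bool:
        """Deliver the packet if the budget allows; the result is the feedback."""
        now = self._clock()
        refill = (now - self._last) * self._rate
        self._tokens = min(self._burst, self._tokens + refill)
        self._last = now
        if len(packet) > self._tokens:
            self.dropped += 1
            return False
        self._tokens -= len(packet)
        self.delivered.append(packet)
        return True

fake_time = [0.0]
loop = RateLimitedLoop(20_000, burst=4_000, clock=lambda: fake_time[0])
first = loop.send(b"x" * 3_000)    # fits in the initial burst
second = loop.send(b"x" * 3_000)   # only 1000 bytes of budget left
fake_time[0] += 0.125              # 0.125 s later: 2500 more bytes of budget
third = loop.send(b"x" * 3_000)
```

Passing time.monotonic as the clock would turn the same class into a real-time limiter.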

I also need a workflow model. Currently the recorder has its own queue, which it fills (and discards older packets to keep 2 of them only). On the other hand the main loop has a timer and reads this queue periodically.
Instead I prefer an event-based model. The recorder enqueues a packet and fires an asynchronous event. When starting, the main application creates the recorder and the codec's encoder and registers the encoder as receiving the event from the recorder. It should also tell the codec where to fetch the data from.
Basically I can use two interfaces: one interface that publishes a data queue and can register a delegate event, and another interface that publishes an event receiver and can be told which object has the data queue.
This way only the main application constructs the graph and each module knows nothing of the others.
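
The two roles can be sketched in Python as follows. The DataSource and Encoder classes and their method names are hypothetical, and the callbacks here run synchronously for simplicity, whereas the note calls for asynchronous events. The bounded queue mirrors the recorder keeping only its 2 most recent packets.

```python
from collections import deque

class DataSource:
    """Publishes a bounded data queue and notifies registered listeners."""

    def __init__(self, max_packets: int = 2):
        self.queue = deque(maxlen=max_packets)  # older packets fall off
        self._listeners = []

    def register(self, listener) -> None:
        self._listeners.append(listener)

    def publish(self, packet) -> None:
        self.queue.append(packet)
        for listener in self._listeners:
            listener()

class Encoder:
    """Event receiver: told where the data lives, then woken up by events."""

    def __init__(self):
        self._source = None
        self.encoded = []

    def attach(self, source: DataSource) -> None:
        self._source = source

    def on_data(self) -> None:
        while self._source.queue:
            self.encoded.append(("jpeg", self._source.queue.popleft()))

# Only the main application knows both ends and wires the graph.
recorder = DataSource()
encoder = Encoder()
encoder.attach(recorder)
recorder.register(encoder.on_data)
recorder.publish("frame-1")
recorder.publish("frame-2")

# With nobody listening, the queue keeps only the 2 most recent packets.
idle = DataSource()
for n in range(5):
    idle.publish(n)
```

Neither class imports or names the other's concrete type beyond the queue it is handed, which is the point of letting the main application do the wiring.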

So basically I foresee, more or less in that order:

There are actually two instances of the video player:

The updated workflow is the following:

+----------+        +-------+    +--------+               +----------+    +-------+
| Recorder |        | Mixer | => | Sender | => ( net ) => | Receiver | => | Demux |
+----------+        +-------+    +--------+               +----------+    +-------+
     |                ^   ^        |                                          |
     |   +---------+  |   |        |                             +---------+  |
     +-->| Encoder |--+   +---<<---+                             | Decoder |<-+
         +---------+      (feedback)                             +---------+
              |                                                       |
              v                                                       v
         +--------+                                              +--------+
         | Player |                                              | Player |
         +--------+                                              +--------+
           (local)                                                (remote)

Later the improvement could come in this order:

Note that at this point nothing is said about having feedback from the receiver back to the sender.


«»  2005/03/20 «» Modules  «»

I realize that I did not say anything about which module would hold the RISender and RIReceiver interfaces. These are going to be used by classes in the devices, mixer, codecs and network modules. As I want to reduce inter-dependencies, I need to put them elsewhere and I think the Utils module is just fine for that.

Currently the Utils module contains the RILog interface and the RPref class.

Basically I want the various modules to form the following graph:

                Utils
                  |
      +-------+-------+--------+
      |       |       |        |
      v       v       v        v
   Device   Mixer   Codec   Network
      |       |       |        |
      +-------+-------+--------+
                  |
       XeresLib   |
          |       |
          +-------+
                  |
                  v
               XeresApp

Minimizing dependencies can only be beneficial.

Making sure there are no cross-references is necessary. I've learned when trying to modularize Nerdkill C# that this was one of the causes of some weird link issues I had with Visual Studio.


«»  2005/03/22 «» Modules  «»

There was an obvious missing link in the module dependency graph:

          Utils
            |
    +-------+------------------+
    |       |                  |
    |       |                  v
    |       |      +-------+-------+--------+
    |       |      |       |       |        |
    v       |      v       v       v        v
 XeresLib   |   Device   Mixer   Codec   Network
    |       |      |       |       |        |
    |       |      +-------+-------+--------+
    |       |                  |
    |       |                  |
    |       |                  |
    +-------+------------------+
            |
            v
         XeresApp
            |
            v
      (TestsConsole)

This is more accurate as it depicts that the utils module is used by all other modules. The direct implication is that classes in the utils module must be generic and never need to use anything even remotely app-specific.
Another way to think of it: the utils module is actually copied from the AppSkeleton project when creating a new application. Thus all changes made to this module can potentially be back ported to the AppSkeleton project and consequently must not depend on anything app-specific.

TestsConsole is the console project that performs the NUnit testing. Building TestsConsole also runs the test suite as a post-build action, which has the wanted side-effect of stopping the build in case of error. TestsConsole does not technically reference XeresApp. The dependency is just created to force the tests to be performed as the last build step.
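
The "no cross-references" rule amounts to the dependency graph being acyclic, which can be checked mechanically. The Python sketch below uses the standard graphlib module (Python 3.9 or later); the edge list is read off the graph above, and the "broken" variant is a hypothetical cross-reference added only to show the failure case.

```python
from graphlib import TopologicalSorter, CycleError

# module -> modules it depends on, as drawn in the graph above
DEPENDS_ON = {
    "Utils": set(),
    "XeresLib": {"Utils"},
    "Device": {"Utils"},
    "Mixer": {"Utils"},
    "Codec": {"Utils"},
    "Network": {"Utils"},
    "XeresApp": {"Utils", "XeresLib", "Device", "Mixer", "Codec", "Network"},
    "TestsConsole": {"XeresApp"},  # build-order dependency only
}

def build_order(graph: dict) -> list:
    """Return a valid build order, or raise CycleError on a cross-reference."""
    return list(TopologicalSorter(graph).static_order())

def has_cycle(graph: dict) -> bool:
    try:
        build_order(graph)
    except CycleError:
        return True
    return False

order = build_order(DEPENDS_ON)

# A hypothetical cross-reference: Codec and Mixer referencing each other.
broken = dict(DEPENDS_ON, Codec={"Utils", "Mixer"}, Mixer={"Utils", "Codec"})
```

Any valid order starts with Utils and ends with XeresApp followed by TestsConsole, matching the intent of running the tests as the last build step.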


«»  2005/03/22 «» NUnit Testing  «»

I need to add a blurb on the NUnit testing.

The previous comment outlined when and how the NUnit tests are run: the TestsConsole sub-project is built as the last step of the application build. The post-build step of the project basically runs the project itself. If a test fails, the build fails. This implies a 0% fault tolerance: all tests must pass. Always.

Due to the way VS.Net works, there's a subtle effect: if the TestsConsole fails to build (generally because one of the other modules failed to build), then the old tests are run and they may silently succeed. I could actually fix that by removing the tests console exe in the pre-build phase, yet that would result in this exe being built every time... Not a good solution. Another way to look at it is that the point is moot: if any of the projects fails to build, the build failed. Period. So just look at the log to figure out where it did. If there's a build error, it is irrelevant to know whether the tests passed or not.

For simplicity purposes, I chose to have one TestsConsole project containing all the tests. The alternative was to have one test project per module, that is something such as TestUtils, TestDevice, TestCodec, etc. Since I do not see a point in running just a sub part of the tests, this would just add more projects to maintain for nothing.
OTOH what I did is that each module has its own namespace in the test suite, that is TestsConsole.Utils, TestsConsole.Devices, etc. This means it will be easy to back port the TestsConsole.Utils suite to the AppSkeleton project later as needed. It also means that if I ever wanted to run a sub part of the tests I could create a suite based on the namespace.

Also, right now as far as testing goes I do not test anything from XeresApp. There is actually no dependency between XeresApp and TestsConsole except for the need of controlling the build order. My philosophy is to put all UI-related code in XeresApp and anything remotely logic-related in XeresLib. I do perform testing on XeresLib though.
Part of the problem is that I fail to see how to test anything UI-related. I do not think that a unit test can properly check that a system call actually opened a window or that a widget displays the proper way. There is a subset of action/reaction tests that could be performed, such as simulating a click on a button by calling its callback function and checking whether some other UI item has been checked or activated... but only internal properties could be validated and there's no guarantee that the visual output matches the internal state.
UI-related tests should be handled by a whole different testing strategy IMHO and I don't have any plans nor idea for such right now. Something like QA with sequential test plans comes to mind.

Implementation wise, I finally found a way to adapt the test-first motto to something more realistic that works for me.

XP defines unit testing and test-first as part of the same process. The underlying concept is to write a test first, see it fail, then create the code that will make the test pass. Although I agree with the underlying concept, it doesn't work for me. I want the tools to help me, not cut into my productivity, and one of the greatest helps I get is auto-completion in the IDE. If I create a test that operates on some method or class that I haven't created yet, I do not get any benefit from auto-completion. I'm no better off than working with vim or Emacs and this is not what I want. Also, as explained before, if the test fails to build, VS.Net will run the previous version of the tests (a limitation due to the way I set the tests to run.) The only way to do test-first in my mind is to create a minimalist test, then the minimalist code that allows it to compile, and then see it fail. This is by the book. Well, I decided then it wouldn't hurt to do it the reverse way: create the minimalist class first, compile it so that the IDE can grab the metadata needed to perform completion -- this is a test in itself, that is it compiles. Then create the minimalist test and see it compile. Then run it and see it fail.
Once I have this basic framework, I can actually write test-first by expanding the test, see it fail and expand the code. The difference with the original concept is merely an implementation detail IMHO.

Now I expand this by actually not using test-first at all. I actually find it easier to work in code-and-test-first. Once I have an underlying framework where both a class and its test unit exist (so that auto-completion can work in the IDE), I like to sketch the outline of the class methods first, maybe populate part of the code as I see fit, then write the matching test and fire it all to see what happens. I find it easier to write the code and the matching test in parallel; one makes the internal logic explicit and the other serves as a use case that refines the design. For example, maybe I thought some method should return a string, yet when I want to use it, it may be more natural to return the string as an out parameter and just a boolean indicating whether it worked or not. Or it may be more natural to return the string and a null pointer on failure. Or throw an exception. Depending on the task at hand, that is how the method is to be used and its implementation, writing the test case presents a good opportunity to statically check if the method makes sense.
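
To make the three API shapes above concrete, here is a small sketch. It is in Python rather than C#, so the out parameter is approximated by returning a (success, value) pair. The SETTINGS table and the function names are hypothetical and only serve the illustration.

```python
from typing import Optional, Tuple

SETTINGS = {"codec": "jpeg"}   # hypothetical data, for illustration only


def get_or_none(key: str) -> Optional[str]:
    """Shape 1: return the string, or None on failure."""
    return SETTINGS.get(key)


def try_get(key: str) -> Tuple[bool, str]:
    """Shape 2: success flag plus value (stand-in for a C# 'out' parameter)."""
    if key in SETTINGS:
        return True, SETTINGS[key]
    return False, ""


def get_or_raise(key: str) -> str:
    """Shape 3: return the string, or raise on failure."""
    if key not in SETTINGS:
        raise KeyError(key)
    return SETTINGS[key]


# Writing the caller side is the "static check": whichever call reads most
# naturally in the test is probably the shape the method should have.
ok, value = try_get("codec")
missing = get_or_none("bitrate")
```

Here the test code is the first real user of the method, so an awkward call site shows up before any application code depends on it.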

On the other hand, I clearly see the benefit of having unit tests. Even in such a small project as Xeres I had the opportunity to catch implementation bugs before even launching the application, or to catch refactoring-induced bugs (in that case it was a standard copy/paste bug.)


«»  2005/03/22 «» Sender and Receiver  «»

The RISender interface describes an entity that can produce data in the form of RBuffer instances.

The sender doesn't actually "send" any data. Instead it provides a buffer queue and can send a notification to an RIReceiver when new data is available. The receiver will be responsible for extracting the data from the queue.

The only mandatory rule here is that once a receiver has received a notification, it must be able to fetch at least one data buffer from the sender queue.

The sender does not need to have as many queued data buffers as notifications sent. The notification merely indicates that at least one data buffer is available. This way, a receiver may choose to either poll the sender's queue repeatedly, or on the other hand it may choose to drop notification events if it doesn't need data, or to just keep a note that some data can be fetched at a later time.

Consequently, depending on the nature of the queue being managed, it may be up to the sender to remove obsolete queued buffers that have not been used yet.

RSenderBase is a default base implementation of RISender. It provides a default implementation that has a buffer queue, can add a new buffer, can remove the first available buffer, can purge old buffers (when adding a new one) and performs the event dispatch.
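
A rough sketch of what such a base class does, written in Python rather than the project's C#. The SenderBase name, the max_queued purge policy and the use of plain callables as receivers are assumptions made for the illustration; the actual RSenderBase may purge on a different criterion.

```python
import threading
from collections import deque


class SenderBase:
    """Sketch of a sender: owns a buffer queue and notifies receivers."""

    def __init__(self, max_queued=4):
        self._lock = threading.Lock()
        self._queue = deque()
        self._max_queued = max_queued   # assumed purge policy: keep the N newest
        self._receivers = []

    def add_receiver(self, receiver):
        """Register a callable invoked as receiver(sender) on new data."""
        self._receivers.append(receiver)

    def add_buffer(self, buf):
        with self._lock:
            # Purge old buffers that nobody fetched before queuing the new one.
            while len(self._queue) >= self._max_queued:
                self._queue.popleft()
            self._queue.append(buf)
        # Notify outside the lock so a receiver may dequeue from its callback.
        for receiver in list(self._receivers):
            receiver(self)

    def remove_first(self):
        """Hand the oldest queued buffer over to the caller, or None if empty."""
        with self._lock:
            return self._queue.popleft() if self._queue else None


sender = SenderBase(max_queued=2)
seen = []
sender.add_receiver(lambda s: seen.append("notified"))
for frame in (b"f1", b"f2", b"f3"):
    sender.add_buffer(frame)
first = sender.remove_first()   # b"f2": b"f1" was purged when b"f3" arrived
```

Three notifications were sent but only two buffers remain, which matches the rule that a notification only promises at least one available buffer.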

The RIReceiver interface describes the counterpart to RISender, that is an entity that can receive notifications when a sender has new buffers available.

RIReceiver derived classes do not actually "receive" buffers. They are only notified when a sender produces new buffers. The implementation here is responsible for grabbing the buffers when appropriate with the only guaranteed rule that at least one buffer will be made available. Implementations should also be careful in locking the sender's queue when dequeuing buffers.
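
One of the receiver behaviours mentioned earlier, keeping a note that data is available and fetching it later, could look like the sketch below. It is in Python, not C#, and StubSender, DeferredReceiver, on_data_available and drain are hypothetical names for the illustration only.

```python
import threading
from collections import deque


class StubSender:
    """Hypothetical stand-in for a sender: a locked queue and nothing more."""

    def __init__(self):
        self.lock = threading.Lock()
        self.queue = deque()


class DeferredReceiver:
    """Receiver that only records the notification and drains the queue later."""

    def __init__(self):
        self._pending = threading.Event()

    def on_data_available(self, sender):
        # Called by the sender: kept cheap, no buffer is dequeued here.
        self._pending.set()

    def drain(self, sender):
        """Fetch every queued buffer if a notification was recorded."""
        if not self._pending.is_set():
            return []
        self._pending.clear()
        buffers = []
        with sender.lock:               # lock the sender's queue while dequeuing
            while sender.queue:
                buffers.append(sender.queue.popleft())
        return buffers


sender = StubSender()
receiver = DeferredReceiver()
with sender.lock:
    sender.queue.extend([b"a", b"b"])
receiver.on_data_available(sender)
receiver.on_data_available(sender)      # a second notification adds nothing new
got = receiver.drain(sender)
```

Because the receiver drains everything it finds, it does not matter that the number of notifications differs from the number of buffers.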

Here are various strategies to implement the callback:

Note that overall the sender/receiver pattern used here could more appropriately be called a producer/consumer pattern. The difference is subtle yet both give a clear idea of the underlying workflow.


«»  2005/03/27 «» Performance  «»

Now that I have the first part of my workflow (i.e. Recorder -> Player, 2 out of 9 items), I'm looking at performance and at first I got scared. I ran the application and got a mere 5 frames per second! Keep in mind that at this point the application does more or less:

The test application that comes with VideoCapture.Net does the same at 15 fps so why do I get only 5 here??!

Well first I was running under VS.Net's debugger in debug mode. By running it out of the debugger I get 12-13 fps, and by using the release version I get one more fps. Then if I comment out the JPEG encoding/decoding and disable the display I get one more fps so I effectively get almost 15 fps again.

So overall it's not as bad as it sounded at first. 15 fps max is OK considering that the target is an internet-based web chat, so I'll be happy if I can transmit 1-5 fps overall.

Then again 100% CPU on an Athlon 64 3000+ to get 15 fps is not extremely fast either...
Since my workflow is nowhere near completion, the yet-to-be-done processing will need some non negligible CPU time thus it is not acceptable to use 100% CPU right now.

Looking with the excellent Application Profiler, I see that the application spends almost 80% of its time in a new byte call when receiving frames. That means there's a lot of room for optimization, in two different ways.

First I want to insert a crude rate limiter. If I effectively limit the frame rate to 5 fps, obviously I'll save whatever time was used by these extra 10 fps. Ideally the application should only use 33% CPU if the frame rate is limited to one third, right? Hmmm ;-)
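
A crude rate limiter of this kind can be sketched as follows, in Python rather than C#. The RateLimiter class, its accept method and the injectable clock are assumptions for the illustration; the clock is injectable only so that the example is deterministic.

```python
import time


class RateLimiter:
    """Crude limiter: accept an event only if enough time has passed
    since the last accepted one. Units are whatever the clock returns."""

    def __init__(self, min_interval, clock=time.monotonic):
        self._min_interval = min_interval
        self._clock = clock
        self._last = None

    def accept(self):
        now = self._clock()
        if self._last is not None and now - self._last < self._min_interval:
            return False        # too soon: drop this frame
        self._last = now
        return True


# Simulated capture: 15 frames over one second, timestamps in milliseconds.
timestamps_ms = [i * 1000 // 15 for i in range(15)]
ticks = iter(timestamps_ms)

# 200 ms between accepted frames, i.e. at most 5 fps.
limiter = RateLimiter(min_interval=200, clock=lambda: next(ticks))
kept = [t for t in timestamps_ms if limiter.accept()]
```

With these simulated timestamps, 5 of the 15 frames are kept. Note that dropping a frame after it has been captured and copied only saves the downstream work, so the CPU usage would not necessarily fall to one third.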

Second there's some obvious optimization that can be done in VideoCapture.Net for my specific case.
Currently the VideoCaptureDevice::OnSample callback does this:

So there are 3 objects allocated and two data copies done each time. None of the object allocations are actually useful and only one data copy is actually needed.
The EventArgs class could be instantiated once when the device is enabled and then reused for each callback call. Since the code makes sure not to reenter the callback if already in it, it merely needs to make sure the EventArgs data is not changed while the callback is being called. Then the callback itself does not need to re-allocate the bitmap or the buffer each time. They can be allocated once with the EventArgs and reused as long as it is made clear by the usage semantics that the bitmap should not be disposed by the client callback. Then again, depending on the client's usage, only one of the raw data byte array or the bitmap object is really needed.
So we'll see. I need to reinstall the DirectX 9 SDK and if I manage to recompile VideoCapture.Net I'll probably tweak it for my own needs.
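
The reuse idea can be sketched like this. It is a Python illustration, not the VideoCapture.Net code: FrameArgs, Device and on_sample are hypothetical names, and a bytearray stands in for the raw data byte array.

```python
class FrameArgs:
    """Hypothetical reusable event-args object: allocated once, refilled per frame."""

    def __init__(self, frame_size):
        self.data = bytearray(frame_size)   # single allocation, reused for every frame
        self.length = 0


class Device:
    def __init__(self, frame_size, callback):
        self._args = FrameArgs(frame_size)  # created once, when the device is enabled
        self._callback = callback
        self._in_callback = False

    def on_sample(self, raw):
        if self._in_callback:
            return                          # never reenter: shared args must not change
        if len(raw) > len(self._args.data):
            raise ValueError("frame larger than the preallocated buffer")
        self._in_callback = True
        try:
            n = len(raw)
            self._args.data[:n] = raw       # the one data copy that is actually needed
            self._args.length = n
            self._callback(self._args)
        finally:
            self._in_callback = False


seen = []


def client(args):
    # The client must copy what it wants to keep: args is reused afterwards.
    seen.append((id(args.data), bytes(args.data[:args.length])))


dev = Device(4, client)
dev.on_sample(b"abcd")
dev.on_sample(b"wxyz")
```

Both callbacks see the same underlying buffer object, so the usage contract has to state that the client neither keeps nor disposes it.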


«»  2005/04/30 «» Lightweight notification interface  «»

Earlier I was describing the sender and receiver interface which is used to implement the buffer flow management.

The purpose of this interface is to pass buffers around. The idea is that buffers may potentially be heavy and one wants to limit as much as possible how many buffers get created and moved around.
Since the implementation is .Net/C#, the only thing that really gets transferred between a sender and a receiver is a reference, or pointer.
Yet somehow from my point of view the semantic of a buffer is that it is something "expensive" to manipulate and the semantic of the transfer is more about transferring ownership than real data.
So the interface is built to minimize what gets transferred: the sender owns whatever buffers it makes available till a receiver grabs them.

From an implementation point of view, this is anything but a lightweight interface. The sender will probably be implemented in most cases as an object instance running in its own thread with its own message queue. The receiver will generally be another instance running in its own thread too.

On the other hand, for a completely different target, I may want a secondary interface that achieves a similar result yet with different objectives:

In this interface, the focus is more on passing a message to a target than on transmitting something. The message will generally have data associated, yet it may not.

(..to be continued..)



Site License

Creative Commons License
This work is licensed by Raphaël Moll under a Creative Commons License.
