$Id: Xeres.izu,v 1.26 2006-05-09 15:58:19 ralf Exp $
20051023
The main project goal is to produce an ...
To be done (by decreasing priority):
Finished (by decreasing date):
(Notes are given in chronological order.)
This is an extension of the Xeres project with a twist.
Xeres' goal was to understand how to create an audio/video chat
application.
Xeres II's goal is to create a remote audio/video viewer.
In a way, think of it as an audio/video chat application where the source is
not necessarily a webcam and not necessarily live. It may be a webcam, or it
may simply be recorded video messages.
This uses a standard client/server approach:
Some more descriptive use-cases:
Some implementation or technical points:
For this project, I'd like to change my methodology a bit.
Rapid Application Development clearly doesn't work when the project remains
untouched for several months.
(Update 20051112: As a result I wrote Ideas For Dev.)
I finally understand what I was looking for: conceptually Xeres, Rivet and Rig
should be separate components of a bigger project. And that bigger project is
exactly Xeres II.
Here's the idea:
In other words: the viewer part of the client is Rivet, the video exchange part is Xeres
and the server/web page mode is Rig II.
Some misc details:
To act as a webcam:
Project wise, it makes sense to keep 3 separate projects:
There should be some dependencies between the projects:
More comments & details to follow.
This is just a rough idea anyway, it should be clearly and fully specified before anything happens.
20050129
The main project goal is to produce an audio/video/text chat application.
Minimalist that is. Or more exactly, an experiment regarding the
design and the performance that one could expect with minor knowledge
in real time compression.
"Why?" you will ask. Good question. First, because I'm disappointed by the
various video conferencing software I've used. And then mostly "because I can"
or more exactly "can I really do it?" That's the experimental part of the
project: I'd like to see what is doable nowadays with "standard" algorithms
available as open source and easy to use.
Don't get me wrong. I do like OpenH323 a lot and I am
not even trying to redo the same thing. I can't beat an H.264 codec in my spare time.
The target here is very specific as you can see in the notes below.
I'm targeting a 160 kbps bandwidth with a 160 ms ping time,
one client behind a NAT/firewall and the other directly on the net.
20050331
I need to expand on the goal of this project:
building the application I have in mind is one part of the project, yet I
would say that's almost the smallest part of the motivation.
The real motivation lies in how and with what.
20050130
Milestones:
2.3.0 M0: Initial prototype
The goal of M0 is to explore the architecture described in the notes.
Each module & class should be as minimal as needed.
Some parameters may be fixed (destination IP, etc.)
and the user interface will be extremely minimal.
There are three phases in this milestone:
It is also important to see that since the workflow is symmetrical only
one workflow direction needs to be implemented in the test bed.
The application's user interface written here has nothing to do with the
real end-user interface of a "chat application".
2.1. M0: Initial prototype
To be done (by decreasing priority):
Finished (by decreasing date):
(Notes are given in chronological order.)
There are various components and they should all be as independent as
possible. Components are grouped logically in modules. Dependencies between
modules should be made only through abstract interfaces.
The various components are:
The workflow should look like this:
Modules, classes, etc.:
Implementation could logically be organized in steps as follows:
Note that the encoders shouldn't care whether they are invoked from the mixer
or to test processing of the recorder's buffers. That is they just need to operate
on singular buffers or streams of buffers. Ideally each buffer should be encoded
separately, that is maybe the encoder needs the spatial information from the
source but the decoder should still work if some buffers are missing.
Buffers need some metadata. For example for the video stream rather than
transmit whole images it may be more efficient for the encoder to transmit only
what changed, by transmitting just a number of macroblocks f.ex. Also the
meta data should contain the encoder's name so that the correct decoder can be
selected. The meta data may be a hashtable with a minimal set of types accepted
and serialized by the network class (well actually by the RBuffer itself).
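As an illustration of the "hashtable with a minimal set of types" idea, here is a minimal Python sketch (the application itself is .Net, so this only shows the principle). The wire layout, the one-byte type tags and the names pack_metadata / unpack_metadata are assumptions made up for this example; a rectangle is represented as an (x, y, width, height) tuple.

```python
import struct

# Hypothetical one-byte type tags; the notes only ask for "a minimal set of types".
TAG_INT, TAG_FLOAT, TAG_STR, TAG_RECT = b"i", b"d", b"s", b"r"


def pack_metadata(meta):
    """Serialize a dict of str -> int | float | str | (x, y, w, h) to bytes."""
    parts = [struct.pack("!H", len(meta))]
    for key, value in meta.items():
        raw_key = key.encode("utf-8")
        parts.append(struct.pack("!H", len(raw_key)) + raw_key)
        if isinstance(value, bool):
            raise TypeError("bool is not an accepted metadata type")
        if isinstance(value, int):
            parts.append(TAG_INT + struct.pack("!q", value))
        elif isinstance(value, float):
            parts.append(TAG_FLOAT + struct.pack("!d", value))
        elif isinstance(value, str):
            raw = value.encode("utf-8")
            parts.append(TAG_STR + struct.pack("!I", len(raw)) + raw)
        elif isinstance(value, tuple) and len(value) == 4:
            parts.append(TAG_RECT + struct.pack("!4i", *value))
        else:
            raise TypeError(f"unsupported metadata type for {key!r}")
    return b"".join(parts)


def unpack_metadata(data):
    """Inverse of pack_metadata."""
    (count,) = struct.unpack_from("!H", data, 0)
    pos = 2
    meta = {}
    for _ in range(count):
        (key_len,) = struct.unpack_from("!H", data, pos)
        pos += 2
        key = data[pos:pos + key_len].decode("utf-8")
        pos += key_len
        tag = data[pos:pos + 1]
        pos += 1
        if tag == TAG_INT:
            (value,) = struct.unpack_from("!q", data, pos)
            pos += 8
        elif tag == TAG_FLOAT:
            (value,) = struct.unpack_from("!d", data, pos)
            pos += 8
        elif tag == TAG_STR:
            (size,) = struct.unpack_from("!I", data, pos)
            pos += 4
            value = data[pos:pos + size].decode("utf-8")
            pos += size
        elif tag == TAG_RECT:
            value = struct.unpack_from("!4i", data, pos)
            pos += 16
        else:
            raise ValueError(f"unknown type tag {tag!r}")
        meta[key] = value
    return meta
```

With this layout the encoder's name travels as an ordinary string entry (for example under a "codec" key, also a made-up name), so the receiving side can pick the matching decoder before touching the payload.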
For testing purposes (i.e. NUnits), network related functions should always use
a generic stream class when possible. This way the sender/receiver pair can be
tested using string buffers.
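A minimal sketch of that testing idea, in Python rather than C#/NUnit: the send and receive functions only assume a binary stream object, so an in-memory buffer can stand in for the socket. The length-prefixed framing and the function names are assumptions for this example, not the project's actual protocol.

```python
import io
import struct


def send_buffer(stream, payload):
    """Write one length-prefixed buffer to any writable binary stream."""
    stream.write(struct.pack("!I", len(payload)))
    stream.write(payload)


def receive_buffer(stream):
    """Read one length-prefixed buffer; return None at end of stream."""
    header = stream.read(4)
    if len(header) < 4:
        return None
    (size,) = struct.unpack("!I", header)
    payload = stream.read(size)
    if len(payload) < size:
        raise EOFError("stream ended in the middle of a buffer")
    return payload


def loopback_roundtrip(payloads):
    """Send payloads through an in-memory stream and read them back."""
    wire = io.BytesIO()
    for payload in payloads:
        send_buffer(wire, payload)
    wire.seek(0)
    received = []
    while (item := receive_buffer(wire)) is not None:
        received.append(item)
    return received
```

A unit test then only has to compare the list passed to loopback_roundtrip with the list it returns; no network is involved.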
Note that the workflow only goes in one direction (recorder towards player.)
In a two-way communication application, two of these workflows will run in
parallel in opposite directions.
Nothing has been said before about network session initiation nor dealing with
NATs/firewalls.
The initial target is to work with this configuration:
DYN cannot access FIX since it's not on the same network, so it can't open
any connection -- there's no port forwarding.
That means the FIX client should open the connections.
But it can't unless it knows DYN's address.
So we still need "something" on the NAT/firewall: a registering proxy.
When DYN starts, it connects to the proxy to register its IP and a name.
When FIX starts, it also connects to the proxy to register its IP and name.
From the end point address, the proxy can determine if both end points are
on the same network or not.
Each client keeps its connection with the proxy open (with a heart beat).
To initiate a session, both clients ask the proxy, not directly the other end
point.
If DYN wants to initiate, the proxy tells FIX to open the connection towards DYN.
In the case where both clients have dynamic or static IPs directly on the Internet,
then the proxy will reply to the client that started first with the other end point
address.
At first, the proxy does NOT forward data.
That means both clients behind NATs/firewalls will not work.
This proxy will be a simple Python network daemon.
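The decision logic of such a registering proxy could be sketched as follows. This is only the "who opens the connection" part, without sockets or heart beats. The class and method names are hypothetical, and two points are assumptions: the proxy is told which address range is its local network, and "the client that started first" is read as the client that asked for the session.

```python
import ipaddress


class SessionRegistry:
    """Decides which registered client should open a connection."""

    def __init__(self, local_network):
        # Assumption: the proxy knows the address range of the network behind the NAT.
        self._lan = ipaddress.ip_network(local_network)
        self._clients = {}

    def register(self, name, ip):
        """Record the address a client connected from, under its name."""
        self._clients[name] = ipaddress.ip_address(ip)

    def is_local(self, name):
        return self._clients[name] in self._lan

    def plan_session(self, initiator, peer):
        """Return (opener, target, target_ip) for a session request."""
        initiator_local = self.is_local(initiator)
        peer_local = self.is_local(peer)
        if peer_local and not initiator_local:
            # Only the client behind the NAT can open the connection.
            opener, target = peer, initiator
        else:
            # Initiator is the local one, or both are on the same side:
            # the initiator simply gets the other end point's address.
            opener, target = initiator, peer
        return opener, target, str(self._clients[target])
```

For example, with FIX registered from the local network and DYN from outside, plan_session("DYN", "FIX") names FIX as the opener, which matches the rule that the local client always opens the connection. Two clients behind different NATs are not handled, as stated above.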
In a two-way communication session, two connections are needed. One is used by FIX
to send data to DYN, the other is used by DYN to send data to FIX. What happens is
that FIX will open both connections. On the one it uses to receive data from DYN
it will essentially do a blocking read. Heart beats should be exchanged
if there's no data to transmit (unlikely).
For a start the two connections are done using TCP streams.
These streams are two-way too, the replies are used to provide the feedback.
Project name changed from FaceAFace to Xeres.
Here are several implementation details I wrote on the back of an envelope
last week. Most of them seem almost obvious, I just need to clearly write them
down here for later reference.
First regarding video compression, the idea is to split an image in blocks.
Instead, the naive implementation I foresee will split an image in blocks
and encode them using JPEG or JPEG 2000 (a.k.a. J2k) and play with the image
quality.
The encoder will have to use the temporal information, that is determine which
blocks have changed from one image to another. Going further, it seems like a
good idea to transmit those blocks we know generally change often. So not only
should each block be diffed with the one from the previous image and given
a "note" (probably a chroma delta), but moreover the encoder should keep an
average of this block value over a window. Blocks that have a higher average
will be transmitted more often.
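The windowed "note" per block could look like the sketch below. Everything here is an assumption for illustration: the score is a mean absolute difference of raw pixel values (not necessarily the chroma delta mentioned above), blocks are plain lists of numbers, and the names BlockScorer, update and pick are made up.

```python
from collections import deque


class BlockScorer:
    """Tracks how much each image block changes, averaged over a window."""

    def __init__(self, block_count, window=8):
        # One bounded history of change scores per block.
        self._history = [deque(maxlen=window) for _ in range(block_count)]
        self._previous = None

    def update(self, blocks):
        """Score a new frame; `blocks` holds one sequence of pixel values per block."""
        for index, block in enumerate(blocks):
            if self._previous is None:
                delta = 0.0
            else:
                old = self._previous[index]
                delta = sum(abs(a - b) for a, b in zip(block, old)) / len(block)
            self._history[index].append(delta)
        self._previous = [list(block) for block in blocks]

    def average(self, index):
        """Average change of one block over the window."""
        history = self._history[index]
        return sum(history) / len(history) if history else 0.0

    def pick(self, count):
        """Indices of the `count` blocks with the highest average change."""
        order = sorted(range(len(self._history)), key=self.average, reverse=True)
        return order[:count]
```

An encoder would call update() for every captured frame and then send only the blocks returned by pick(), with the count chosen to fit the available bandwidth.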
Note that the first implementation uses a TCP stream, so technically if both
sides are connected, there can be no data loss. This may change though, as it
makes sense for this kind of video to be transmitted as UDP packets. Out of
order packets can be trashed if too old or used if they paint a block which
has not been painted by a more recent block yet. In this case the decoder simply
needs to keep one time stamp per block indicating the last time it was refreshed.
Also in this case it would make sense for the receiver to give regular feedback
to the emitter saying how many packets were received in a given time frame.
This would allow the emitter to quantify packet losses.
The feedback should also incorporate data that allows measuring the transmit
delay. This feedback is more for stats purposes than controlling the bandwidth
as it doesn't say anything useful on congestion per se.
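The "one time stamp per block" rule for out-of-order packets can be sketched as below. The class name, the packet fields and the two counters (kept so the receiver has something to report back as feedback) are assumptions for this example; the "too old" cutoff is left out.

```python
class BlockCanvas:
    """Keeps the newest data seen for each block, whatever the arrival order."""

    def __init__(self, block_count):
        self.data = [None] * block_count
        self.stamps = [None] * block_count  # last refresh time per block
        self.received = 0                   # packets seen, for feedback stats
        self.used = 0                       # packets actually painted

    def apply(self, block_index, timestamp, payload):
        """Paint the block if the packet is newer than what is displayed.

        Returns True when the packet was used, False when it was trashed.
        """
        self.received += 1
        last = self.stamps[block_index]
        if last is not None and timestamp <= last:
            return False
        self.data[block_index] = payload
        self.stamps[block_index] = timestamp
        self.used += 1
        return True
```

A late packet is therefore still useful as long as nothing more recent has painted the same block.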
The other implementation detail I was thinking of is that the various classes
such as encoder/decoders
that need to do something UI-related shouldn't access the UI directly.
Being a .Net/WinForm implementation, the UI may not be running in the same thread
and thus should not be accessed directly. The typical workaround is to use
an asynchronous delegate
(i.e. an asynchronous thread-safe event call.)
Finally as I was coding RBuffer recently, I put directly in the class
a time stamp member and rectangle bounds member. This contradicts my earlier
note that a buffer should contain data and metadata. The metadata would be a
hash table maybe with some constant integer keys for frequently used metadata
and optional string keys for custom class-specific metadata.
The obvious solution is that I want an RBuffer, which is generic, and I want
a specific RBufferVideo which has a time stamp and a bounds rectangle.
Also I was initially assuming the metadata would be only strings, thus serializing
and deserializing a rectangle from it each time I needed to access it would be
expensive. This is a mistake. What I should do is clearly document the hash table
as only containing a fixed subset of types,
for example int, double, string and rectangle.
Also the base class RBuffer needs a serializer/deserializer for network stream.
As I don't know exactly what I will want here, I'll just create dummy methods
throwing a NotImplementedYet exception and then implement them later as needed.
Just implemented the basis of RRecorderVideo device enumeration.
The RIRecorder interface also gained a Logger property which is set by the
main window. Currently the logger is the main window. There will be a need
to introduce an extra indirection if and when I need to append stuff to
the log from a separate thread. I plan to do that by creating an RAsyncLog
which implements the RILog interface and simply dispatches log messages to
a given target using the asynchronous delegate
technique.
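The principle behind such an asynchronous log can be sketched in Python with a thread-safe queue. This is an analogy, not the .Net asynchronous delegate mechanism itself: worker threads only enqueue messages, and the thread that owns the target (the UI thread in the real application) drains the queue. The names AsyncLog, log and dispatch_pending are made up for this example.

```python
import queue
import threading


class AsyncLog:
    """Collects log messages from any thread; the owner thread delivers them."""

    def __init__(self, target):
        self._target = target          # callable taking one string (e.g. the UI logger)
        self._pending = queue.Queue()  # thread-safe FIFO

    def log(self, message):
        """Safe to call from any thread; never touches the target directly."""
        self._pending.put(message)

    def dispatch_pending(self):
        """Call from the thread that owns the target, e.g. on a UI timer."""
        delivered = 0
        while True:
            try:
                message = self._pending.get_nowait()
            except queue.Empty:
                return delivered
            self._target(message)
            delivered += 1


if __name__ == "__main__":
    lines = []
    log = AsyncLog(lines.append)
    worker = threading.Thread(target=log.log, args=("hello from worker",))
    worker.start()
    worker.join()
    log.dispatch_pending()
    print(lines)
```

The recorder or codec classes only ever see the log() side, so they need no knowledge of the UI or of which thread it lives in.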
I also introduced a RDeviceFactory class in the Device module.
This is not really a factory in the traditional sense of it but I lack a better
word for it. Basically it will host instances of the recorders and players
instead of having them created and managed by the main window.
The idea is that the main window should deal mostly with UI stuff and it will
change a lot in the future so the less code I place there, the better.
This factory hosts the instances of the recorders and players, it initializes them
and also provides a place for utility methods that act on all recorders and players
at once.
Another advantage of the factory is that it forces the UI to access the recorders
or players only through their RIRecorder or RIPlayer interface and thus minimizes
potential abuse such as a direct call to a method not in the interfaces.
I'm going to reorganize the plan as follows:
I'll probably integrate SNE2 directly in Xeres and transform it to use the
full .Net framework (the current SNE2 is a .Net Compact Framework library, which
is good for Pocket PC compatibility but is too limited for my taste.)
Then once the canvas is ready I can expand it:
These last days, I finished the first version of the RRecorderVideo that grabs
images. As a test, it enqueued System.Drawing.Bitmap instances instead of
RBufferVideo in its data queue. Then on a timer the main app was polling that
queue and displaying the images.
The second phase of that test was to enqueue RBufferVideo instances.
So naturally I started adding code that would do a Bitmap.LockBits, read the bits
and store them as 24-bpp RGB in the buffer, directly in the frame-received
callback. Then I realized this was not convenient for debugging so I added
a new RBufferVideo.FromBitmap() method, which also had the advantage that I could easily
write an NUnit test for it. Eventually I realized it would be more efficient to
store 32bpp RGBA data with an empty alpha channel to take advantage of the 4-byte
alignment.
Then this morning I realized how this is all wrong.
Let me explain: in my traditional C-like kind of thinking, a bitmap is just a bunch of
bytes, that is a pointer and some length and it's up to me to deal with the format,
the components order and what they all represent. So my initial reflex was to
get the bits from a Bitmap object and manipulate them.
I just tried a quick test/hack with JPEG encoding.
Now I need to think of where to go and change my planning accordingly.
The current plan is to implement the video player part.
That sounds good.
The other part is that I need to put the encoding/decoding part in the Codec
module so I may need to do that before the video player.
Also I need to make a fake Network module which is a local loop with a rate
limiter. Finally I need to add the Mixer which will use feedback from the network
to drop packets and control the quality/frame rate to match the bandwidth.
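A fake network module of that kind could be as small as the sketch below: a local loop that accepts packets up to a bandwidth budget and reports the rest as dropped, which gives the mixer something to react to. The token-bucket approach, the class name LoopbackLink and the explicit `now` argument (used instead of a real clock so it can be unit tested) are all assumptions for this example.

```python
class LoopbackLink:
    """Local loop that accepts at most `bits_per_second` worth of packets."""

    def __init__(self, bits_per_second, burst_bits=None):
        self.rate = float(bits_per_second)
        # By default allow a burst of one second worth of data.
        self.capacity = float(burst_bits if burst_bits is not None else bits_per_second)
        self._tokens = self.capacity
        self._last = None
        self.delivered = []
        self.dropped = 0

    def send(self, packet, now):
        """Deliver `packet` (bytes) at time `now` (seconds) or drop it.

        Returns True if delivered; a mixer can treat False as feedback.
        """
        if self._last is not None:
            elapsed = max(0.0, now - self._last)
            self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now
        cost = len(packet) * 8
        if cost > self._tokens:
            self.dropped += 1
            return False
        self._tokens -= cost
        self.delivered.append(packet)
        return True
```

With the 160 kbps target mentioned earlier, LoopbackLink(160_000) lets roughly 20 KB through per second; anything beyond that is counted in `dropped`.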
I also need a workflow model. Currently the recorder has its own queue, which it fills
(and discards older packets to keep 2 of them only). On the other hand the main loop
has a timer and reads this queue periodically.
So basically I foresee, more or less in that order:
There are actually two instances of the video player:
The updated workflow is the following:
Later the improvement could come in this order:
Note that at this point nothing is said about having feedback from the receiver
back to the sender.
I realize that I did not say anything about which module would hold
the RISender and RIReceiver interfaces.
These are going to be used by classes in the devices, mixer, codecs and
network modules. As I want to reduce inter-dependencies, I need to put them
elsewhere and I think the Utils module is just fine for that.
Currently the Utils module contains the RILog interface and the RPref class.
Basically I want the various modules to form the following graph:
Minimizing dependencies can only be beneficial.
Making sure there are no cross-references is necessary.
I learned when trying to modularize Nerdkill C#
that this was one of the causes of some weird link issues I had with Visual.
There was an obvious missing link in the module dependency graph:
This is more accurate as it depicts that the utils module is used by
all other modules. The direct implication is that classes in the utils
module must be generic and never need to use anything even remotely
app-specific.
TestsConsole is the console project that performs the NUnit testing.
Building TestsConsole also runs the test suite as a post-build action,
which has the wanted side-effect of stopping the build in case of error.
TestsConsole does not technically reference XeresApp. The dependency is
just created to force the tests to be performed as the last build step.
I need to add a blurb on the NUnit testing.
The previous comment outlined when and how the NUnit tests are run: the
TestsConsole sub-project is built as the last step of the application build.
The post-process of the projects basically runs itself. If a test fails, the
build fails. This implies a 0% fault tolerance: all tests must pass. Always.
Due to the way VS.Net works, there's a subtle effect: if the TestsConsole
fails to build (generally because one of the other modules failed to build),
then the old tests are run and they may silently succeed. I could actually
fix that by removing the tests console exe in the pre-build phase, yet that
would result in this exe being built every time... Not a good solution.
Another way to look at it is that the point is moot: if any of the projects
fails to build, the build failed. Period. So just look at the log to figure
where it did. If there's a build error, it is irrelevant to know whether
the tests passed or not.
For simplicity purposes, I choose to have one TestsConsole project containing
all the tests. The alternative was to have one test project per module,
that is something such as TestUtils, TestDevice, TestCodec, etc. Since I do not
see a point in running just a sub part of the tests, this would just add more
projects to maintain for nothing.
Also, right now as far as testing goes I do not test anything from XeresApp.
There is actually no dependency between XeresApp and TestsConsole except for
the need of controlling the build order.
My philosophy is to put all UI-related code in XeresApp and anything remotely
logic-related in XeresLib. I do perform testing on XeresLib though.
Implementation wise, I finally found a way to adapt the
test-first motto to
something more realistic that works for me.
XP defines unit testing and test-first
as part of the same process. The underlying concept is to write a test
first, see it fail, then create the code that will make the test pass.
Although I agree with the underlying concept, it doesn't work for me.
I want the tools to help me, not cut on my productivity and one of the
greatest helps I get is auto-completion in the IDE. If I create a test
that operates on some method or a class that I haven't created yet, I do not
have any benefit from auto completion. I'm no better off than working with
vim or Emacs and this is not what I want. Also, as explained before if the
test fails to build, VS.Net will run the previous version of the tests (a
limitation due to the way I set the tests to run.) The only way to do
test-first in my mind is to create a minimalist test then the minimalist
code that allows it to compile and then see it fail. This is from the book.
Well I decided then it wouldn't hurt to do it the reverse way: create the
minimalist class first, compile it so that the IDE can grab the metadata
needed to perform completion -- this is a test in itself, that is it compiles.
Then create the minimalist test and see it compile. Then run it and see it fail.
Now I expand this by actually not using test-first at all.
I actually find it easier to work in code-and-test-first.
Once I have an underlying framework where both a class and its test unit
exists (so that auto completion can work in the IDE), I actually like to
sketch the outline of the class methods first, maybe populate part of the
code as I see fit, then write the matching test and fire it all to see what
happens. I find it easier to write the code and the matching test in parallel;
one makes the internal logic explicit and the other serves as a use case that refines
the design. For example maybe I thought some method should return a string
yet when I want to use it, it may be more natural to return the string as an out
parameter and just a boolean indicating whether it worked or not. Or it may be
more natural to return the string and a null pointer on failure. Or throw an
exception. Depending on the task at hand, that is how the method is to be used
and its implementation, writing the test case presents a good opportunity to
statically check if the method makes sense.
On the other hand, I clearly see the benefit of having unit tests.
Even in such a small project as Xeres I had the opportunity to catch
implementation bugs before even launching the application, or experience
refactoring induced bugs (in that case it was a standard copy/paste bug.)
The RISender interface describes an entity that
can produce data in the form of RBuffer instances.
The sender doesn't actually "send" any data.
Instead it provides a buffer queue and can send
a notification to an RIReceiver when new data is
available. The receiver will be responsible for
extracting the data from the queue.
The only mandatory rule here is that once a receiver
received a notification, it must be able to fetch at
least one data buffer from the sender queue.
The sender does not need to have as many queued
data buffers as notifications sent. The notification
merely indicates that at least one data buffer is
available.
This way, a receiver may choose to either poll
the sender's queue repeatedly, or on the other
hand it may choose to drop notification events
if it doesn't need data, or to just keep a note
that some data can be fetched at a later time.
Consequently, depending on the nature of the queue
being managed, it may be up to the sender to remove
obsolete queued buffers that have not been used yet.
RSenderBase is a default base implementation of
RISender. It provides a default implementation
that has a buffer queue, can add a new buffer,
can remove the first available buffer, can purge
old buffers (when adding a new one) and performs
the event dispatch.
The RIReceiver interface describes the counterpart
to RISender, that is an entity that can receive
notifications when a sender has new buffers available.
RIReceiver derived classes do not actually "receive"
buffers. They are only notified when a sender
produces new buffers. The implementation here is
responsible for grabbing the buffers when appropriate
with the only guaranteed rule that at least one buffer
will be made available.
Implementations should also be careful in locking the
sender's queue when dequeuing buffers.
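The sender/receiver contract described above can be sketched as follows, in Python rather than C#. The class and method names are hypothetical stand-ins for RSenderBase and an RIReceiver implementation, and the default queue depth of 2 is borrowed from the recorder queue described earlier. The receiver shown uses the "keep a note and fetch later" strategy and assumes it is the only consumer of the queue.

```python
import threading
from collections import deque


class SenderBase:
    """Owns a bounded buffer queue and notifies receivers about new data."""

    def __init__(self, max_queued=2):
        self._queue = deque()
        self._max = max_queued
        self._lock = threading.Lock()
        self._receivers = []

    def add_receiver(self, receiver):
        self._receivers.append(receiver)

    def enqueue(self, buffer):
        """Queue a new buffer, purge obsolete ones, then notify receivers."""
        with self._lock:
            self._queue.append(buffer)
            while len(self._queue) > self._max:
                self._queue.popleft()  # drop old buffers nobody fetched
        for receiver in list(self._receivers):
            receiver.on_data_available(self)  # notify outside the lock

    def dequeue(self):
        """Return the oldest queued buffer, or None if the queue is empty."""
        with self._lock:
            return self._queue.popleft() if self._queue else None


class LatestOnlyReceiver:
    """Only remembers that data is pending and fetches it when convenient."""

    def __init__(self):
        self._pending = None

    def on_data_available(self, sender):
        self._pending = sender  # just keep a note, do no work here

    def fetch_latest(self):
        """Drain the sender's queue and return the newest buffer, if any."""
        sender, self._pending = self._pending, None
        latest = None
        while sender is not None:
            buffer = sender.dequeue()
            if buffer is None:
                break
            latest = buffer
        return latest
```

Locking lives inside dequeue(), so a receiver cannot forget it. The number of notifications and the number of queued buffers are deliberately not tied together.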
Here are various strategies to implement the callback:
Note that overall the sender/receiver pattern used here
could more appropriately be called a producer/consumer pattern.
The difference is subtle yet both give a clear idea of the
underlying workflow.
Now that I have the first part of my workflow (i.e.
Recorder -> Player, 2 out of 9 items), I'm looking at performances and at first
I got scared. I run the application and I get a mere 5 frames per second!
Keep in mind that at this point the application does more or less:
The test application that comes with VideoCapture.Net
does the same at 15 fps so why do I get only 5 here??!
Well first I was running under VS.Net's debugger in debug mode.
By running it out of the debugger I get 12-13 fps, and by using the release version
I get one more fps. Then if I comment out the JPEG encoding/decoding and disable the
display I get one more fps so I effectively get almost 15 fps again.
So overall it's not as bad as it sounded at first. 15 fps max is OK considering
that the target is for an internet-based web chat, so I'll be happy if I can
transmit 1-5 fps overall.
Then again 100% CPU on an Athlon 64 3000+ to get 15 fps is not extremely fast either...
Looking with the excellent Application Profiler,
I see that the application spends almost 80% of its time in a new byte call
when receiving frames. That means there's a lot of room for optimization, in two
different ways.
First I want to insert a crude rate limiter. If I effectively limit the frame rate
to 5 fps, obviously I'll save whatever time was used by these extra 10 fps.
Ideally the application should only use 33% CPU if the frame rate is limited to
one third, right? Hmmm ;-)
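A crude rate limiter along those lines simply refuses frames that arrive too soon after the last accepted one. This is a sketch under assumptions: the class name is made up, time stamps are passed in as seconds by the caller, and dropped frames are assumed to be discarded before any copying or encoding happens (otherwise little CPU is saved).

```python
class FrameRateLimiter:
    """Accepts at most `max_fps` frames per second and drops the rest."""

    def __init__(self, max_fps):
        self._interval = 1.0 / max_fps
        self._next_allowed = None

    def accept(self, timestamp):
        """Return True if a frame captured at `timestamp` (seconds) should be kept."""
        if self._next_allowed is not None and timestamp < self._next_allowed:
            return False
        self._next_allowed = timestamp + self._interval
        return True
```

Because the interval is measured from the last accepted frame, the effective rate can end up slightly below the limit when the source frame period does not divide the interval evenly.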
Second there's some obvious optimization that can be done in
VideoCapture.Net
for my specific case.
Earlier I was describing the sender and receiver interface
which is used to implement the buffer flow management.
The purpose of this interface is to pass buffers around. The idea is that
buffers may potentially be heavy and one wants to limit as much as possible
how many buffers get created and moved around.
From an implementation point of view, this is anything but a lightweight
interface. The sender will probably be implemented in most cases as an
object instance running in its own thread with its own message queue.
The receiver will generally be another instance running in its own thread too.
On the other hand, for a completely different target, I may want a secondary
interface that achieves a similar result yet with different objectives:
In this interface, the focus is more on passing a message to a target
than transmitting something.
The message will generally have data associated, yet it may not.
(..to be continued..)
1. Milestone 2
1.1. Goal
1.2. Design & Spec
1.3. Plan
2005xxxx [1.N] Classes: xx
2005xxxx [1.F] xx
20051023 [1.F] This file
1.4. Notes
Still it's a good idea to keep the functionality separated in different
projects to minimize interface pollution, especially since I will probably implement
different parts using different languages or frameworks:
2. Milestone 1
2.1. Goal
For example OpenH323 does all that but it does not apply.
It's far from easy to use imho, and after having tried to use their OpenPhone demo app for
several months with the GNU Gatekeeper I'm quite disappointed.
(Note that here I am not talking about the OpenPhone project,
just the application demo that uses OpenH323.)
The experiment part is: do I need to?
Could a naive implementation simply encode video using
JPEG 2000
and a standard GSM codec
(or other audio codec)
using only free or open libraries?
The first how is how would I implement it? The structure design and
the ability to actually implement the design as preconceived or not is in
fact more important for me than the resulting product of the implementation.
How do I want to design it, how does that design hold in real life,
and how does the design get fixed/adapted to meet the goal?
The second with what means what existing open source projects, libraries
or material can help serve as a foundation? For example can I find existing
libraries to handle video acquisition, encoding, decoding, etc.?
Do I have to recode my own GSM codec or can I legally use an existing one in
a GPL context?
2.2. Description
2.3. Milestones
2.4. Plan
20050322 [1.N] Classes: RReceiverThread, create class
20050326 [1.N] Classes: RRecorderVideo, limit frame rate
20050130 [1.N] Classes: RICodec, create interface with Encode/Decode methods
20050130 [1.N] Classes: RCodecJpeg, Encode, Decode (full picture)
20050315 [2.N] Classes: RCodecJpeg, Encode, Decode use sub-blocks
20050130 [1.N] Classes: RInputMixer, create class
20050315 [1.N] Classes: RInputMixer, initial impl simply takes packets and forward to network
20050130 [1.N] Classes: ROutputDemux, create class
20050315 [1.N] Classes: ROutputDemux, initial impl simply takes packets from receiver
20050130 [1.N] Classes: RNetworkSender, create class
20050130 [1.N] Classes: RNetworkReceiver, create class
20050315 [1.N] Classes: RNetworkSender/Receiver, initiate connection
20050315 [1.N] Classes: RNetworkSender/Receiver, exchange packets
20050315 [1.N] Classes: RInputMixer, better impl splits packets to match network sender requested size
20050315 [1.N] Classes: ROutputDemux, recombines split packets
20050315 [1.N] Classes: RNetworkSender report back to InputMixer
20050315 [1.N] Classes: RInputMixer, ability to drop packets or change quality
20050130 [1.N] Classes: RInputMixer, get data queues from several recorders and mix them
20050130 [1.N] Classes: ROutputDemux, demux data from mixer into several data queues
20050130 [1.N] Classes: ROutputDemux, feed data to players
20050130 [2.N] TestApp: Display meaningful bandwidth stats
20050315 [1.N] Planning: Complete M0 initial task list (step 2)
20050130 [2.N] Classes: RRecorderText: create class, attach to UI
20050130 [2.N] TestApp: Plug to text field, implement start/stop
20050130 [2.N] Classes: RRecorderText, implement start, stop
20050130 [2.N] Classes: RRecorderText, capture text in data queue
20050130 [2.N] Classes: RPlayerText: create class, attach to main UI
20050130 [2.N] Classes: RPlayerText: get data buffer from RRecorderText
20050130 [2.N] Classes: RPlayerText: display data buffer
20050130 [2.N] Classes: RRecorderSound: create class, implement device enumeration
20050130 [2.N] TestApp: Use device enumeration, pref selection, implement start/stop
20050130 [2.N] Classes: RRecorderSound, implement start, stop
20050130 [2.N] Classes: RRecorderSound, capture sound in data queue
20050130 [2.N] TestApp: Display sound level from RRecorderSound data queue and drop queue
20050130 [2.N] Classes: RPlayerSound: create class, implement device enumeration
20050130 [2.N] TestApp: Use device enumeration, pref selection, implement start/stop
20050130 [2.N] Classes: RPlayerSound: get data buffer from RRecorderSound
20050130 [2.N] Classes: RPlayerSound: output data buffer
20050130 [2.N] Classes: RCodecGsm, Encode, Decode
20050130 [3.N] Daemon: Create skeleton daemon in Python
20050306 [4.N] Classes: RRecorderVideo, handle device changing when already started
20050306 [4.N] Classes: RIRecorder, need an enumerate/set Resolution functionality.
20050306 [4.N] Classes: RRecorderVideo, implement enumerate/set Resolution.
20050306 [4.N] TestApp: Enumerate and get/set video resolution in prefs.
20050130 [5.N] Classes: RCodecJpeg, Sub-block based encoding
20050130 [5.N] Classes: RCodecJpeg2k, Encode, Decode
20050130 [5.N] Classes: RCodecLpc10, Encode, Decode
20050130 [6.N] TestApp: Ability to choose codec in prefs
20050327 [1.F] TestApp: Display frame rate
20050326 [1.F] Classes: RFreqCounter in Utils
20050325 [1.F] Classes: RDevFactory, TestApp: connect RRecorderVideo to RPlayerVideo local
20050325 [1.F] Classes: RPlayerVideo: async loop, display data buffer for local
20050325 [1.F] Classes: RPlayerVideo: create class, attach to main UI
20050325 [2.F] TestApp: Enable/disable audio/video depending on device availability
20050324 [1.F] Classes: RPlayerVideo: create class (skeleton + unit file)
20050324 [1.F] TestApp: Testing with smaller image size: 176x144.
20050323 [1.F] TestApp: Use RSenderBase in existing test app
20050322 [1.F] Classes: RSenderBase, create classes
20050322 [1.F] Classes: RISender, RIReceiver, create interfaces
20050322 [1.F] Classes: Moved RBuffer to Utils. Added RIBuffer interface.
20050320 [2.F] Classes: Adding inline doc comments for all classes and public methods
20050320 [2.F] Framework: Updated to use NUnit 2.2
20050315 [1.F] Planning: Complete M0 initial task list (step 1)
20050310 [2.F] TestApp: Recorder encoding as Jpeg in RBufferVideo and MainApp display Jpeg buffers
20050309 [2.F] Classes: Added RBufferVideo.FromBitmap
20050307 [1.F] TestApp: Fixed loading window size to work with auto resize window.
20050306 [1.F] TestApp: Automatic video preview resize. Auto resize window.
20050306 [1.F] TestApp: Display recorded images from RRecorderVideo data queue
20050306 [1.F] Classes: RRecorderVideo, capture images in data queue
20050306 [1.F] Classes: RRecorderVideo, implement start, stop
20050306 [1.F] TestApp: Updating preferences for video recorder device
20050306 [1.F] TestApp: Added preference window
20050306 [1.F] Classes: Added RDeviceFactory, fixed name RIEncoder as RIRecorder
20050306 [1.F] TestApp: Use device enumeration, pref selection, implement start/stop
20050306 [1.F] Classes: RRecorderVideo: create class, implement device enumeration
20050227 [1.F] Classes: RIRecorder, RIPlayer, create interfaces
20050227 [1.F] Classes: RBufferVideo, create class
20050221 [1.F] Classes: RBuffer, create class
20050220 [1.F] Classes: RDeviceInfo, create class
20050220 [1.F] Misc: Add to CVS
20050220 [1.F] Project: Create skeleton project for Network module
20050220 [1.F] Project: Create skeleton project for Codec module
20050220 [1.F] Project: Create skeleton project for Mixer module
20050220 [1.F] Project: Create skeleton project for Device module
20050220 [1.F] TestApp: Add logging display
20050220 [1.F] TestApp: Add preferences storage
20050220 [1.F] TestApp: Implement placeholder start/stop/pause/quit
20050220 [1.F] TestApp: Implement placeholders for videos, sound level, text, stats, log
20050220 [1.F] Project: Create skeleton project for test bed application
20050130 [1.F] Planning: M0 defined, initial task list
20041221 [1.F] This file
2.5. Milestone 1
2.6. Milestone 2
2.7. Notes
+----------+    +-------+    +--------+               +----------+    +-------+    +--------+
| Recorder | => | Mixer | => | Sender | => ( net ) => | Receiver | => | Demux | => | Player |
+----------+    +-------+    +--------+               +----------+    +-------+    +--------+
                    ^                ^                 |                  ^
       +---------+  |                |                 |     +---------+  |
       | Encoder |--+                +--------<<-------+     | Decoder |--+
       +---------+                       (feedback)          +---------+
If FIX wants to initiate, the proxy tells FIX to open the connection towards DYN.
The bottom line is that the proxy always tells the local network's client to open
the connection, which means the NAT will automatically send feedback packets
accordingly.
This is of course similar to what MPEG and H.261-H.263 codecs do: they split
an image into macroblocks, encode them separately using DCT, perform more or less
vector prediction/compensation on these and transmit only a subset of the DCT
coefficients to match the bandwidth.
I'm not going there. That is, I am not reimplementing H.261 -- if I really wanted
to do that I would just use an existing codec, from the OpenH323 project for example.
Yet what would that manipulation be? My goal is not to do a DCT-like encoding
manually. I'm not trying to reinvent JPEG or MPEG here. Instead what I want
for my first prototype is to manipulate bitmaps "as-is". That means I just need
to keep the Bitmap object from .Net, encode it as JPEG using the internal
methods and finally stream that JPEG data directly into the RBufferVideo.
The second part of the encoder/decoder will be to work with the same principle
but on a sub-block basis. Again, I don't even have to try to get the bits of the
bitmap, I just need to use the JPEG encoder/decoder available right in .Net.
Currently the hack is that RBufferVideo has a method which receives a Bitmap,
encodes it as JPEG and stores the stream. Then there's a reverse function which,
given a data stream, decodes the JPEG and creates an Image object.
The video recorder directly places the JPEG buffers in its queue and the MainApp
grabs them on a timer.
The test shows that the bandwidth is way over the 160 kilobit limit; no big
surprise here.
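The arithmetic behind that observation can be sketched as follows. This is only an illustration in Python (not the project's C#); it assumes the 160 kilobit figure is a per-second budget, and the frame size and frame rate used in the usage note are made-up inputs, not measurements from the test.

```python
def stream_kbits_per_second(frame_bytes: int, frames_per_second: float) -> float:
    """Bandwidth needed to send every frame as-is, in kilobits per second."""
    return frame_bytes * 8 * frames_per_second / 1000.0


def max_frame_bytes(limit_kbits: float, frames_per_second: float) -> float:
    """Largest average frame size (in bytes) that fits a kbit/s budget."""
    return limit_kbits * 1000.0 / 8 / frames_per_second


# Assumption: the 160 kilobit limit from the notes, read as kbit/s.
LIMIT_KBITS = 160
```

For instance, with hypothetical 10000-byte JPEG frames at 10 frames per second, `stream_kbits_per_second(10000, 10)` gives 800.0, while `max_frame_bytes(LIMIT_KBITS, 10)` gives 2000.0 bytes per frame.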
Instead I prefer an event-based model. The recorder enqueues a packet and fires an
asynchronous event. When starting, the main application creates the recorder and the
codec's encoder and registers the encoder as receiving the event from the recorder.
It should also tell the codec where to fetch the data from.
Basically I can use two interfaces: one interface that publishes a data queue and can
register a delegate event, and another interface that publishes an event receiver
and can be told which object has the data queue.
This way only the main application constructs the graph and each module knows nothing
of the others.
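A minimal sketch of the two roles described above, written in Python rather than C# to keep it short. All names here (QueueSource, Encoder, build_graph) are hypothetical, the "encoding" is a placeholder, and the event is fired synchronously for simplicity whereas the notes call for an asynchronous event.

```python
from collections import deque
from typing import Callable, Deque, List, Optional


class QueueSource:
    """First role: publishes a data queue and lets listeners register for events."""

    def __init__(self) -> None:
        self.queue: Deque[bytes] = deque()
        self._listeners: List[Callable[["QueueSource"], None]] = []

    def register(self, listener: Callable[["QueueSource"], None]) -> None:
        self._listeners.append(listener)

    def enqueue(self, packet: bytes) -> None:
        # Store the packet, then notify every registered listener.
        self.queue.append(packet)
        for listener in self._listeners:
            listener(self)


class Encoder(QueueSource):
    """Second role: receives events and is told which object holds the data queue.

    It is also a QueueSource so that the next stage can be chained the same way.
    """

    def __init__(self) -> None:
        super().__init__()
        self.source: Optional[QueueSource] = None

    def set_source(self, source: QueueSource) -> None:
        self.source = source

    def on_packet(self, sender: QueueSource) -> None:
        # Drain the configured source; the sender argument is only informative.
        while self.source is not None and self.source.queue:
            raw = self.source.queue.popleft()
            self.enqueue(b"enc:" + raw)  # placeholder for real encoding


def build_graph():
    """Only the main application knows how the modules are wired together."""
    recorder = QueueSource()
    encoder = Encoder()
    encoder.set_source(recorder)
    recorder.register(encoder.on_packet)
    return recorder, encoder
```

After `recorder, encoder = build_graph()`, calling `recorder.enqueue(b"frame1")` leaves the recorder queue empty and the encoder queue holding one encoded packet. Neither class refers to the other by type, which is the point of the design.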
+----------+  +-------+    +--------+               +----------+    +-------+
| Recorder |  | Mixer | => | Sender | => ( net ) => | Receiver | => | Demux |
+----------+  +-------+    +--------+               +----------+    +-------+
 |                ^                ^                 |                  |
 |   +---------+  |                |                 |     +---------+  |
 +-->| Encoder |--+                +--------<<-------+     | Decoder |<-+
     +---------+                       (feedback)          +---------+
          |                                                     |
          v                                                     v
      +--------+                                            +--------+
      | Player |                                            | Player |
      +--------+                                            +--------+
       (local)                                               (remote)
            Utils
              |
  +-------+-------+--------+
  |       |       |        |
  v       v       v        v
Device  Mixer   Codec   Network
  |       |       |        |
  +-------+-------+--------+
              |
  XeresLib    |
      |       |
      +-------+
          |
          v
      XeresApp
          Utils
            |
    +-------+------------------+
    |       |                  |
    |       |                  v
    |       |      +-------+-------+--------+
    |       |      |       |       |        |
    v       |      v       v       v        v
XeresLib    |    Device  Mixer   Codec   Network
    |       |      |       |       |        |
    |       |      +-------+-------+--------+
    |       |                  |
    |       |                  |
    |       |                  |
    +-------+------------------+
            |
            v
        XeresApp
            |
            v
     (TestsConsole)
Another way to think of it: the utils module is actually copied from
the AppSkeleton project when creating a new application. Thus all changes
made to this module can potentially be back ported to the AppSkeleton
project and consequently must not depend on anything app-specific.
OTOH what I did is that each module has its own namespace in the test suite,
that is TestsConsole.Utils, TestsConsole.Devices, etc. This means it will be
easy to back port the TestsConsole.Utils suite to the AppSkeleton project
later as needed. It also means that if I ever wanted to run a sub part
of the tests I could create a suite based on the namespace.
Part of the problem is that I fail to see how to test anything UI-related.
I do not think that a unit test can properly check that a system call
actually opened a window or that a widget displays the proper way.
There is a subset of action/reaction tests that could be performed, such as
simulating a click on a button by calling its callback function and checking
whether some other UI item has been checked or activated... only internal properties
could be validated and there's no guarantee that the visual output matches the
internal state.
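As an illustration of such an action/reaction test, here is a small Python sketch (not the project's C#). The form, the checkbox stand-in and the handler name are all hypothetical; the test calls the click handler directly and inspects internal properties only, so it says nothing about what is actually drawn on screen.

```python
class FakeCheckBox:
    """Stand-in for a real widget: only the internal state is modeled."""

    def __init__(self) -> None:
        self.checked = False


class PreferencesForm:
    """Hypothetical form: clicking 'enable video' checks a box and enables a list."""

    def __init__(self) -> None:
        self.video_enabled = FakeCheckBox()
        self.device_list_enabled = False

    def on_enable_video_clicked(self) -> None:
        # This is the callback a real button would invoke on click.
        self.video_enabled.checked = True
        self.device_list_enabled = True


def check_enable_video_click() -> bool:
    """Simulate the click by calling the callback, then check internal state."""
    form = PreferencesForm()
    form.on_enable_video_clicked()
    return form.video_enabled.checked and form.device_list_enabled
```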
UI-related tests should be handled by a whole different testing strategy IMHO
and I don't have any plan or idea for one right now. Something like QA
with sequential test plans comes to mind.
Once I have this basic framework, I can actually write test-first by expanding
the test, seeing it fail and expanding the code. The difference with the original
concept is merely an implementation detail IMHO.
Since my workflow is nowhere near completion, the yet-to-be-done processing will
need some non-negligible CPU time, so it is not acceptable to use 100% CPU right now.
Currently the VideoCaptureDevice::OnSample callback does this:
So there are 3 objects allocated and two data copies done each time.
None of the object allocations are actually useful and only one data copy
is actually needed.
The EventArgs class could be instantiated once, when the device is
enabled, and then reused for each callback call. Since the code makes sure not to
reenter the callback if already in it, it merely needs to make sure the EventArgs
data is not changed while the callback is being called.
Then the callback itself does not need to re-allocate the bitmap nor the
buffer each time. They can be allocated once with the EventArgs and reused
as long as it is made clear by the usage semantics that the bitmap should not
be disposed by the client callback. Then again depending on the client's
usage, only one of the raw data byte array or the bitmap object is really
needed.
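The reuse idea can be sketched like this, in Python rather than C#. The class and member names (SampleEventArgs, CaptureDevice, on_sample, dropped) are hypothetical and do not reflect the actual VideoCapture.Net API; the sketch only shows one preallocated event-args object, a copy into a reused buffer, and a guard against reentering the callback.

```python
from typing import Callable


class SampleEventArgs:
    """Allocated once when the device is enabled, then reused for every sample."""

    def __init__(self, capacity: int) -> None:
        self.buffer = bytearray(capacity)    # reused storage, never reallocated
        self.view = memoryview(self.buffer)  # lets us copy into it in place
        self.length = 0                      # number of valid bytes in buffer


class CaptureDevice:
    def __init__(self, capacity: int,
                 client_callback: Callable[[SampleEventArgs], None]) -> None:
        self._args = SampleEventArgs(capacity)
        self._callback = client_callback
        self._in_callback = False
        self.dropped = 0  # samples skipped because the callback was still running

    def on_sample(self, data: bytes) -> None:
        # Never reenter: the shared args must not change under the client.
        if self._in_callback:
            self.dropped += 1
            return
        self._in_callback = True
        try:
            n = min(len(data), len(self._args.buffer))
            # Copy the sample into the preallocated buffer.
            self._args.view[:n] = memoryview(data)[:n]
            self._args.length = n
            self._callback(self._args)
        finally:
            self._in_callback = False
```

The usage contract is the one stated above: the client may read `args.buffer[:args.length]` during the callback but must copy anything it wants to keep, since the same object is handed out again for the next sample.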
So we'll see. I need to reinstall the DirectX 9 SDK and if I manage to
recompile VideoCapture.Net I'll probably tweak it for my own needs.
Since the implementation is .Net/C#, the only thing that really gets
transferred between a sender and a receiver is a reference, or pointer.
Yet somehow from my point of view the semantic of a buffer is that it is
something "expensive" to manipulate and the semantic of the transfer is more
about transferring ownership than real data.
So the interface is built to minimize what gets transferred: the sender owns
whatever buffers it makes available till a receiver grabs them.
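A minimal sketch of that ownership hand-over, in Python rather than C#. BufferSender and its methods are hypothetical names and are not the actual RISender/RIReceiver interfaces; the sketch only shows that a reference moves from the sender to the receiver and that the sender forgets the buffer once it is taken.

```python
from collections import deque
from typing import Deque, Optional


class BufferSender:
    """Owns every buffer it has made available until a receiver takes it."""

    def __init__(self) -> None:
        self._available: Deque[bytearray] = deque()

    def publish(self, buffer: bytearray) -> None:
        # The sender keeps the reference: it is still the owner.
        self._available.append(buffer)

    def take(self) -> Optional[bytearray]:
        """Hand the oldest buffer to the caller; no data is copied.

        The sender drops its reference, so the caller becomes the owner.
        Returns None when nothing is available.
        """
        return self._available.popleft() if self._available else None

    def owned_count(self) -> int:
        return len(self._available)
```

A receiver would call `take()` when it is ready; the object it gets back is the very same buffer that was published, not a copy.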

This work is licensed by Raphaël Moll under a Creative Commons License.