Video automation testing at Skype


         Pierre Gronlier - pierre.gronlier@skype.net

Video Software Development Engineer in Test - Microsoft Skype division



                        April 2012 - Kiev
1   The Video Library
2   Continuous Integration
      Building
      Testing
      Feedback
3   Unit, Component, System testing
      Some wrappers for testing.
      Test Driven Development
4   Cross-platform testing
      CI team
      Plugin mechanisms
5   NFR
      Definition
      KPIs
      Increase visibility
6   Conclusion
The Video Library
What is Skype made of ?


                                     UI

                                  Network

                    Video          Audio      Messaging



                            Figure: Inside Skype

                              Video Codec      Streaming         ToolBox
                                               Platforms
                              Apple, Android, Windows, Linux, Embedded, ...


                                 Figure: Inside the Video Library

  Platforms contains platform-specific code, such as capture and rendering methods.
Continuous Integration
Continuous integration means:
    building continuously.

    testing continuously.

    getting immediate feedback.
Quickbuild

  http://www.pmease.com/features/




                                Figure: Quickbuild
Quickbuild
     For the Video Library alone, there are around 20 different build configurations covering
     different platforms and compilation modes.
         release/debug
         internal/external
         stable/experimental
         ...
     We have a farm of build machines.
     To keep compilation and maintenance manageable across platforms, Makefiles are used for
     compiling and the farm agents are written in Java.




                                       Figure: HeatMap
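
A minimal sketch of how such a build matrix could be enumerated. The axis names `audience` and `channel` and the platform subset are assumptions made for illustration; the slide only names the three pairs of modes. The real set of around 20 configurations is presumably not a full cross product.

```python
from itertools import product

# Axes named on the slide; the platform list is a hypothetical subset.
PLATFORMS = ["windows", "linux", "android"]
MODES = ["release", "debug"]
AUDIENCES = ["internal", "external"]      # axis name is an assumption
CHANNELS = ["stable", "experimental"]     # axis name is an assumption


def build_configurations(platforms, modes, audiences, channels):
    """Return one dict per build configuration (full cross product)."""
    return [
        {"platform": p, "mode": m, "audience": a, "channel": c}
        for p, m, a, c in product(platforms, modes, audiences, channels)
    ]


configs = build_configurations(PLATFORMS, MODES, AUDIENCES, CHANNELS)
print(len(configs))  # 3 * 2 * 2 * 2 = 24
```

In practice a farm would filter this list down to the combinations that are actually shipped or tested.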
Cross branches builds


   Example

  Network :                        Video :                            Codec :
      trunk/                            trunk/                            trunk/
      branches/                         branches/                         branches/
          network-69 *                       video-42                           codec-23 *
          network-68                         video-41 *                         codec-22
          ...                                ...                                ...
       So that two interdependent teams can develop new features without becoming
       incompatible, we compile our code against the latest stable release of each dependency.

       In addition to trunk, we build our latest Long Term Support branch (*) every time a
       fix is backported.

                   Mode                    Network      Video    Codec
                   Video stable              ∅       video-41   codec-23
                   Video release             ∅         trunk    codec-23
                   Video experimental        ∅         trunk      trunk
                   Network release         trunk     video-41   codec-23
                   Network experimental    trunk       trunk      trunk
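
The table above can be read as a lookup from build mode to the ref of each component. A minimal sketch, with the table transcribed as data; the function names are hypothetical and `None` stands for the table's ∅.

```python
# Dependency refs per build mode, transcribed from the table above.
# None means the component is not part of that build (the table's empty set).
BUILD_MODES = {
    "video stable":         {"network": None,    "video": "video-41", "codec": "codec-23"},
    "video release":        {"network": None,    "video": "trunk",    "codec": "codec-23"},
    "video experimental":   {"network": None,    "video": "trunk",    "codec": "trunk"},
    "network release":      {"network": "trunk", "video": "video-41", "codec": "codec-23"},
    "network experimental": {"network": "trunk", "video": "trunk",    "codec": "trunk"},
}


def checkout_plan(mode):
    """Return the (component, ref) pairs to check out for a build mode."""
    refs = BUILD_MODES[mode]
    return [(component, ref) for component, ref in refs.items() if ref is not None]


def modes_affected_by(component, ref):
    """Return the build modes that must be rebuilt when `ref` of `component` changes."""
    return sorted(mode for mode, refs in BUILD_MODES.items() if refs[component] == ref)


print(checkout_plan("video stable"))
print(modes_affected_by("video", "trunk"))
```

The second function shows why a commit on video trunk/ touches several configurations at once.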
CI as a daily tool




   Continuous integration means that:
     1 every 10 minutes, a script checks for new commits on the video trunk/ or branches/.
     2 once a build for a platform succeeds, it triggers a list of short tests. Each test
       lasts around 30 seconds.
     3 at night, a list of longer tests is executed.
     4 for every test execution, a report is stored in a database and the results are
       aggregated on a web page for Devs and QEs.
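
The steps above can be sketched as a single polling pass. This is not Quickbuild's API: `CiState`, `poll_once` and the callables passed in are hypothetical, and the in-memory `reports` list stands in for the results database.

```python
from dataclasses import dataclass, field

POLL_INTERVAL_SECONDS = 10 * 60  # "every 10 minutes" on the slide


@dataclass
class CiState:
    last_seen: dict = field(default_factory=dict)  # branch -> last handled revision
    reports: list = field(default_factory=list)    # stand-in for the results database


def poll_once(state, head_revisions, build, short_tests):
    """One polling pass: build branches with new commits, then run the short tests."""
    for branch, revision in head_revisions.items():
        if state.last_seen.get(branch) == revision:
            continue  # nothing new on this branch
        state.last_seen[branch] = revision
        if not build(branch, revision):
            state.reports.append((branch, revision, "build", False))
            continue  # tests only run after a successful build
        for name, test in short_tests.items():
            state.reports.append((branch, revision, name, test(branch, revision)))
    return state


state = CiState()
heads = {"trunk": 101, "branches/video-41": 57}
poll_once(
    state,
    heads,
    build=lambda branch, revision: branch == "trunk",  # pretend the branch build fails
    short_tests={"smoke": lambda branch, revision: True},
)
print(state.reports)
```

A scheduler would call `poll_once` every `POLL_INTERVAL_SECONDS` and run the longer test list from a separate nightly job.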
CI as a daily tool




                     Figure: Test results
The importance of visual feedback




                   Figure: TVs with build/test feedback

                           Make it visible ! !
Unit, Component, System testing
Who writes and maintains the tests ?
       Writing tests is writing code.
       When you automate testing, QEs are software developers in test.
  The closer and deeper a test gets into the production source code, the more likely it is to
  be a developer test.
       Layer                                               Language
       Skype (UI, Network, Video, Audio, Messaging)        Python
       Video Library (Video Codec, Streaming, ToolBox)     Lua, C#
       Platforms (Apple, Android, Windows, Linux,          Lua
         Embedded, ...)

                     Figures: Inside Skype, Inside the Video Library
QE and Devs together




  1   Don't wait for developers to write your tests.

  2   Define the tests when you define the Acceptance Criteria of your PBI.

  3   Evaluate the value of your tests (e.g. code coverage).

  4   KISS : Keep it Stupid Short and Simple.



                                                    Figure: Test plan
Cross-platform testing
Requirements
  We want the following features:
     run our tests on different platforms
     run our tests with different builds
     retrieve the results of our tests and analyze them
     save the result of the analysis
     output a report, trigger alarms

  The cross-platform CI team can provide:
      a pool of devices, platforms and capture devices.
      access to various builds.
      uniform alerting systems (chat, email, SMS).
      a database.
      a storage space.
              It is only a matter of contract definition between you and the CI team
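
One way to picture such a contract is an abstract interface that every platform plugin implements. This is a sketch under assumptions: `Target`, `register`, `run_on_all` and the `fake-linux` plugin are hypothetical names, not the interface used at Skype.

```python
from abc import ABC, abstractmethod


class Target(ABC):
    """Contract a test target must fulfil (hypothetical interface)."""

    @abstractmethod
    def install(self, build_id: str) -> None:
        """Deploy the given build on the device."""

    @abstractmethod
    def run(self, test_name: str) -> str:
        """Run one test and return its full log."""


TARGETS = {}  # platform name -> Target subclass


def register(platform):
    """Class decorator that adds a Target implementation to the registry."""
    def decorator(cls):
        TARGETS[platform] = cls
        return cls
    return decorator


@register("fake-linux")
class FakeLinuxTarget(Target):
    """In-memory stand-in used only to show the contract in action."""

    def install(self, build_id):
        self.build_id = build_id

    def run(self, test_name):
        return f"{test_name} on {self.build_id}: ok"


def run_on_all(build_id, test_name):
    """Run the same test with the same build on every registered platform."""
    logs = {}
    for platform, cls in TARGETS.items():
        target = cls()
        target.install(build_id)
        logs[platform] = target.run(test_name)
    return logs


print(run_on_all("video-41", "smoke"))
```

Adding a platform then means adding one class; the test runner itself does not change.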
How to conceive a modular testing framework ?
           Targets:
           - tablets, mobile, notebook w/ and w/o hardware encoding camera, desktop
           - Windows (desktop + mobile), Linux, Mac (desktop + mobile), Android


           Figure: Framework. Labelled components: Parsing Server, DataBase, Storage Server,
           Frontend Server, Web Rendering. Labelled flows: Full logs, Reduced logs,
           Insert/Update entry.
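
A minimal sketch of the parsing step in the figure: a full log is reduced to a few KPI values, which are then inserted or updated in a database. The `key=value` log format, the KPI names and the table layout are all assumptions made for illustration; `sqlite3` stands in for the real database.

```python
import re
import sqlite3

# Hypothetical log format: one "name=value" line per KPI, mixed with other output.
LOG_LINE = re.compile(r"^(?P<key>fps|bitrate_kbps|dropped_frames)=(?P<value>\d+(?:\.\d+)?)$")


def reduce_log(full_log):
    """Keep only the KPI lines of a full log, as a {name: float} dict."""
    reduced = {}
    for line in full_log.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:
            reduced[match["key"]] = float(match["value"])
    return reduced


def store(conn, run_id, reduced):
    """Insert or update one row per KPI for the given test run."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "run_id TEXT, kpi TEXT, value REAL, PRIMARY KEY (run_id, kpi))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO results VALUES (?, ?, ?)",
        [(run_id, kpi, value) for kpi, value in reduced.items()],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
full_log = "starting call\nfps=14.5\nsome debug noise\nbitrate_kbps=350\n"
store(conn, "run-1", reduce_log(full_log))
print(conn.execute("SELECT kpi, value FROM results ORDER BY kpi").fetchall())
```

Keeping the full log in storage and only the reduced values in the database keeps the web front end fast while still allowing a later re-analysis.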
NFR
What is non-functional ?




  Functional vs Non-Functional
                           the video works = we see something
                                            vs
                  the video has a good quality = we enjoy our video call
Key performance indicators

  List of KPIs

      resolution and frame rate
      bitrate
      dropped frames and freeze durations
      frame quality
      ...

  List of use cases
      for every codec
      for every media protocol version
      1-to-1 call and Group Video Calling
      software encoding vs hardware encoding
      for different network conditions
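
As an illustration of two of the KPIs listed above, frame rate and freeze durations can be derived from the timestamps at which frames were rendered. This is a sketch only: the function name and the 200 ms freeze threshold are assumptions, not values from the Video Library.

```python
def frame_stats(timestamps_ms, freeze_threshold_ms=200):
    """Return the average frame rate and the freeze durations from render timestamps.

    A gap between two consecutive frames longer than `freeze_threshold_ms`
    (an assumed threshold) is counted as a freeze.
    """
    if len(timestamps_ms) < 2:
        return {"fps": 0.0, "freezes_ms": []}
    gaps = [later - earlier for earlier, later in zip(timestamps_ms, timestamps_ms[1:])]
    duration_s = (timestamps_ms[-1] - timestamps_ms[0]) / 1000.0
    fps = (len(timestamps_ms) - 1) / duration_s if duration_s > 0 else 0.0
    freezes = [gap for gap in gaps if gap > freeze_threshold_ms]
    return {"fps": fps, "freezes_ms": freezes}


# Five frames over one second, with one 700 ms stall in the middle.
print(frame_stats([0, 100, 200, 900, 1000]))
```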
Pass/Fail vs Score

     Video Call A (with controlled inputs) | Network Emulation | Video Call B (with analyzed outputs)

  Example:
       KPI              Functional (pass/fail)              Non-functional (score, 0% → 100%)
       resolution       ≠ 0x0                               max = VGA [1]
       framerate        ≠ 0                                 max = 15 fps
       bitrate          in the range of [20..5000] kb       350 kbps ± 10 %
       frame quality    frame exists                        PSNR or SSIM [2] score
       ...              ...                                 ...

    Everything is automated using stats and feedback values from the Video Library.

    [1] VGA = 640x480, QVGA = 320x240, QQVGA = 160x120
    [2] Image quality measurement algorithms
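
The difference between the two columns can be sketched in code: a functional check returns a boolean, while a non-functional check returns a score. The helper names and the `stats` dictionary layout are hypothetical; the thresholds come from the table, and the PSNR formula is the standard one for 8-bit samples.

```python
import math


def functional_pass(stats):
    """Pass/fail: the video works at all."""
    width, height = stats["resolution"]
    return (
        width > 0 and height > 0
        and stats["fps"] > 0
        and 20 <= stats["bitrate_kbps"] <= 5000
    )


def ratio_score(measured, best):
    """Score a KPI as a fraction of the best achievable value, clamped to [0, 1]."""
    return max(0.0, min(1.0, measured / best))


def psnr(reference, received, peak=255):
    """PSNR in dB between two equally sized sequences of pixel values."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, received)) / len(reference)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


stats = {"resolution": (320, 240), "fps": 12, "bitrate_kbps": 300}
print(functional_pass(stats))                 # the call works
print(ratio_score(320 * 240, 640 * 480))      # QVGA pixels relative to VGA
print(ratio_score(stats["fps"], 15))          # frame rate relative to 15 fps
```

A call can therefore pass every functional test and still score poorly, which is exactly the case the non-functional column is meant to catch.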
How to evaluate the best available quality for a call ?

  The best quality of a call is given by :
             optimal settings = gcd(sender, receiver)

  with
             sender   = gcd(max(Encoding power), max(Network), max(Camera))
             receiver = gcd(max(Decoding power), max(Network), max(Screen))

  where, with some simplifications,
             Encoding power = f1(CPU power, Power supply mode, Codec perf.)
             Network        = f2(Bandwidth, RTT, Relay/P2P)
             Camera         = f3(Resolution, Framerate)
             Decoding power = f4(CPU power, Power supply mode, Codec perf.)
             Screen         = f5(Resolution)

   (gcd = greatest common divisor)
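
A minimal sketch of the formula, under one stated assumption: the slide's "gcd" is read loosely here as "the greatest settings every party can support", which is implemented as an element-wise minimum rather than an arithmetic greatest common divisor. The `Settings` type and all capability values below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    width: int
    height: int
    fps: int


def common(*limits):
    """Largest settings that every limit can support (element-wise minimum)."""
    return Settings(
        width=min(limit.width for limit in limits),
        height=min(limit.height for limit in limits),
        fps=min(limit.fps for limit in limits),
    )


# Hypothetical maxima for each factor of the formula.
encoding = Settings(640, 480, 15)
decoding = Settings(640, 480, 30)
network = Settings(320, 240, 15)
camera = Settings(640, 480, 30)
screen = Settings(1280, 720, 60)

sender = common(encoding, network, camera)
receiver = common(decoding, network, screen)
optimal = common(sender, receiver)
print(optimal)
```

With these numbers the network is the bottleneck, so a call that reaches QVGA at 15 fps would score 100% even though both devices could do better on a faster link.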
Compare across revisions / branches
Conclusion
Summary



   1   Quick feedback between development and testing.

   2   Devs and QE in the same team.

   3   Collocation helps a lot !

   4   Don't over-complicate your tests/frameworks.

   5   Measure the efficiency/value of your tests.
Questions

                    Thank you for your attention!

                    Questions?
