CompiledShaderState.h

Commit

Hash : 93b97a59
Author :
Date : 2023-11-03T22:07:23

Make link job directly wait on compile job

Previously, program link waited on the compile job on the calling thread
before launching the link job.  As a result, sequences of intermixed
compile and link calls were largely serialized, as follows:

Main Thread       Thread 1       Thread 2       Thread 3       Thread 4
 Compile -------> Compile
 Compile -----------|----------> Compile
 Link               |              |
   Wait             |              |
   |                |              |
   |<--------------/--------------/
   \------------------------------------------> Link
 Compile -------> Compile                        |
 Compile -----------|----------> Compile         |
 Link               |              |             |
   Wait             |              |             |
   |                |              |             |
   |<--------------/--------------/              |
   \---------------------------------------------|-----------> Link
 Compile -------> Compile                        |              |
 Compile -----------|----------> Compile         |              |
 Link               |              |             |              |
   Wait             |              |             |              |
   |                |              |             |              |
   ...

With this change, the main thread no longer waits for compilation to
finish.  It's the link job itself that does the waiting.  This allows
the main thread to go through Compile and Link commands without
blocking, generating as many jobs as needed.  The above scenario
therefore becomes:

Main     T1     T2     T3     T4     T5     T6     T7     T8     T9
 C ----> C
 C ------|----> C
 L ------|------|----> L
 C ------|------|-------W---> C
 C ------|------|-------|-----|----> C
 L ------|------|-------|-----|------|----> L
 C ------|------|-------|-----|------|-------W---> C
 C ------|------|-------|-----|------|-------|-----|----> C
 L ------|------|-------|-----|------|-------|-----|------|----> L
 .        \-----\------>/     |      |       |     |      |       W
 .                     |       \-----\------>/     |      |       |
 .                     |                    |       \-----\------>/
 .                     |                    |                    |
 .                     |                    |                    |

This greatly improves the amount of parallelism compile and link jobs
get.
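
The idea can be illustrated with a minimal sketch. It is not ANGLE's
implementation: it uses std::async in place of ANGLE's worker pool, and
compileShader, linkProgram and postCompileAndLink are hypothetical names
made up for this example.  The point it shows is that the link job holds
the compile futures and blocks on them itself, so the posting thread
returns immediately.

```cpp
#include <cassert>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a compile job.
std::string compileShader(const std::string &source)
{
    return "compiled(" + source + ")";
}

// Hypothetical link job.  It owns the compile futures and waits on them itself, so the
// thread that posted the job never has to block.
std::string linkProgram(std::vector<std::shared_future<std::string>> compiles)
{
    assert(!compiles.empty());
    std::string program = "link[";
    for (auto &compile : compiles)
    {
        program += compile.get();  // The wait happens here, on the link job's thread.
    }
    return program + "]";
}

// Posts two compile jobs and one link job without waiting for any of them.
std::future<std::string> postCompileAndLink(const std::string &vs, const std::string &fs)
{
    std::vector<std::shared_future<std::string>> compiles;
    compiles.push_back(std::async(std::launch::async, compileShader, vs).share());
    compiles.push_back(std::async(std::launch::async, compileShader, fs).share());
    return std::async(std::launch::async, linkProgram, std::move(compiles));
}
```

The caller only blocks when it finally asks for the link result, for
example through get() on the returned future.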

The careful observer may note that a link job blocked on its compile
jobs now wastes a thread from the thread pool.  While this change is
strictly an improvement, parallelism can be improved further if the link
job is not assigned to a thread until the corresponding compile jobs
have finished.  This is not currently possible, but it may become
possible if:

- Instead of a thread pool, the operating system's FIFO scheduler is
  used.  Then the operating system would automatically put blocking
  tasks to sleep and pick up another task.  This has the downside of
  requiring threads to be created for each task.
- The thread pool work scheduler is enhanced to be aware of the
  relationships between tasks and to avoid scheduling jobs whose
  dependencies are not yet met.
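
As a rough illustration of the second option, here is a sketch of a
scheduler that tracks dependencies.  It is an assumption about how such
an enhancement could look, not existing ANGLE code: DependencyScheduler
and its methods are hypothetical, and the sketch is single-threaded
(run() plays the role of the workers) to keep it short.  A task enters
the ready queue only once all of its dependencies have finished, so a
waiting link job never occupies a worker.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical dependency-aware scheduler.  All tasks must be added before run().
class DependencyScheduler
{
  public:
    using TaskId = std::size_t;

    // Registers a task that may only start after every task in |dependencies| has
    // finished.  Dependencies must be ids returned by earlier addTask calls.
    TaskId addTask(std::function<void()> work, const std::vector<TaskId> &dependencies = {})
    {
        const TaskId id = mTasks.size();
        mTasks.push_back({std::move(work), dependencies.size(), {}});
        for (TaskId dependency : dependencies)
        {
            assert(dependency < id);
            mTasks[dependency].dependents.push_back(id);
        }
        if (dependencies.empty())
        {
            mReady.push(id);
        }
        return id;
    }

    // Runs every task in FIFO order of readiness and returns the execution order.
    std::vector<TaskId> run()
    {
        std::vector<TaskId> order;
        while (!mReady.empty())
        {
            const TaskId id = mReady.front();
            mReady.pop();
            mTasks[id].work();
            order.push_back(id);
            // A finished task unblocks the tasks that were waiting on it.
            for (TaskId dependent : mTasks[id].dependents)
            {
                if (--mTasks[dependent].pendingDependencies == 0)
                {
                    mReady.push(dependent);
                }
            }
        }
        return order;
    }

  private:
    struct Task
    {
        std::function<void()> work;
        std::size_t pendingDependencies;
        std::vector<TaskId> dependents;
    };

    std::vector<Task> mTasks;
    std::queue<TaskId> mReady;
};
```

In a real thread pool the ready queue would be shared by the worker
threads and protected by a lock, but the bookkeeping would be the same:
a per-task count of unfinished dependencies and a list of dependents to
notify on completion.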

Alternatively, the number of threads in the pool can be increased by 30%
in the hope that this is enough.

Bug: angleproject:8297
Change-Id: If4e6540ade47558a10cfab55e2286f073b904928
Reviewed-on: https://chromium-review.googlesource.com/c/angle/angle/+/5006874
Commit-Queue: Shahbaz Youssefi <syoussefi@chromium.org>
Reviewed-by: Geoff Lang <geofflang@chromium.org>
Reviewed-by: Charlie Lao <cclao@google.com>


src/common/CompiledShaderState.h

//
// Copyright 2022 The ANGLE Project Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
//
// CompiledShaderState.h:
//   Defines a struct containing any data that is needed to build
//   a ShaderState from a TCompiler.
//

#ifndef COMMON_COMPILEDSHADERSTATE_H_
#define COMMON_COMPILEDSHADERSTATE_H_

#include "common/BinaryStream.h"
#include "common/Optional.h"
#include "common/PackedEnums.h"

#include <GLSLANG/ShaderLang.h>
#include <GLSLANG/ShaderVars.h>

#include <memory>
#include <string>
#include <vector>

namespace sh
{
struct BlockMemberInfo;
}

namespace gl
{

// @todo this type is also defined in compiler/Compiler.h and libANGLE/renderer_utils.h. Move this
// to a single common definition?
using SpecConstUsageBits = angle::PackedEnumBitSet<sh::vk::SpecConstUsage, uint32_t>;

// Helper functions for serializing shader variables
void WriteShaderVar(gl::BinaryOutputStream *stream, const sh::ShaderVariable &var);
void LoadShaderVar(gl::BinaryInputStream *stream, sh::ShaderVariable *var);

void WriteShInterfaceBlock(gl::BinaryOutputStream *stream, const sh::InterfaceBlock &block);
void LoadShInterfaceBlock(gl::BinaryInputStream *stream, sh::InterfaceBlock *block);

bool CompareShaderVar(const sh::ShaderVariable &x, const sh::ShaderVariable &y);

struct CompiledShaderState
{
    CompiledShaderState(gl::ShaderType shaderType);
    ~CompiledShaderState();

    void buildCompiledShaderState(const ShHandle compilerHandle, const bool isBinaryOutput);

    void serialize(gl::BinaryOutputStream &stream) const;
    void deserialize(gl::BinaryInputStream &stream);

    const gl::ShaderType shaderType;

    int shaderVersion;
    std::string translatedSource;
    sh::BinaryBlob compiledBinary;
    sh::WorkGroupSize localSize;

    std::vector<sh::ShaderVariable> inputVaryings;
    std::vector<sh::ShaderVariable> outputVaryings;
    std::vector<sh::ShaderVariable> uniforms;
    std::vector<sh::InterfaceBlock> uniformBlocks;
    std::vector<sh::InterfaceBlock> shaderStorageBlocks;
    std::vector<sh::ShaderVariable> allAttributes;
    std::vector<sh::ShaderVariable> activeAttributes;
    std::vector<sh::ShaderVariable> activeOutputVariables;

    bool hasClipDistance;
    bool hasDiscard;
    bool enablesPerSampleShading;
    gl::BlendEquationBitSet advancedBlendEquations;
    SpecConstUsageBits specConstUsageBits;

    // GL_OVR_multiview / GL_OVR_multiview2
    int numViews;

    // Geometry Shader
    Optional<gl::PrimitiveMode> geometryShaderInputPrimitiveType;
    Optional<gl::PrimitiveMode> geometryShaderOutputPrimitiveType;
    Optional<GLint> geometryShaderMaxVertices;
    int geometryShaderInvocations;

    // Tessellation Shader
    int tessControlShaderVertices;
    GLenum tessGenMode;
    GLenum tessGenSpacing;
    GLenum tessGenVertexOrder;
    GLenum tessGenPointMode;
};

using SharedCompiledShaderState = std::shared_ptr<CompiledShaderState>;
}  // namespace gl

#endif  // COMMON_COMPILEDSHADERSTATE_H_